Why is WebCopy not scanning my site properly?

01 Jun 2014 (Updated: 19 May 2021)

Occasionally you might find that WebCopy isn't scanning a website properly, and the results pane shows a lot of 404 status codes.

Generally, this is caused by WebCopy incorrectly combining a relative URI with the base URI.

How can I work around it?

Prior to WebCopy 1.0.9.0, the default setting was to map URIs starting with a / to the base URI of the project. If you were copying an entire domain, this would be fine, but if the project's base URI included a sub-level path, then WebCopy would get it wrong.

For example, assuming that http://example.com/favicon.png is a valid URI, if you were to copy http://example.com, then a URI of /favicon.png would map correctly. However, if you had instructed WebCopy to copy http://example.com/sublevel, then it would incorrectly resolve /favicon.png as http://example.com/sublevel/favicon.png and therefore fail to find the resource.
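The difference between the two mappings can be sketched in a few lines of Python. This is only an illustration of the URI resolution rules, not WebCopy's own code: naive_combine is a hypothetical helper that imitates the faulty behaviour described above, while standard_combine uses urljoin from Python's standard library, which resolves a reference starting with / against the host rather than the full base path.

```python
from urllib.parse import urljoin


def naive_combine(base: str, reference: str) -> str:
    """Hypothetical helper: append the reference to the full base URI,
    imitating the faulty mapping described above."""
    return base.rstrip("/") + "/" + reference.lstrip("/")


def standard_combine(base: str, reference: str) -> str:
    """Resolve the reference with the standard library's urljoin, which
    treats a leading / as relative to the host, not to the base path."""
    return urljoin(base, reference)


reference = "/favicon.png"

# Copying the whole domain: both approaches produce the same URI.
print(naive_combine("http://example.com", reference))
print(standard_combine("http://example.com", reference))

# Copying a sub-level path: only the standard resolution finds the resource.
print(naive_combine("http://example.com/sublevel", reference))
print(standard_combine("http://example.com/sublevel", reference))
```

When the base URI is the bare domain, the two functions agree, which is why the problem only shows up for projects that start from a sub-level path.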

If this is happening, open the Project Properties dialog and change the "If any links within this website start with the / character but do not match the website URL path then" option to "prefix with the website domain".

This option will be removed in a future update to WebCopy, after which WebCopy will always map relative URIs without a host to the domain.

I tried that and it didn't work!

If this didn't help, then you might have discovered a bug. Please contact us with the URL you are trying to copy and any other information that will help us investigate and resolve the issue.