Reputation: 13
I'm trying to mirror a website using wget but getting a 404 error, even though the site is accessible through a browser.
Command used:
wget --mirror --convert-links --adjust-extension --page-requisites --no-parent --execute robots=off --user-agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/129.0.0.0 Safari/537.36" -P D:\client-sites\maharjanmetal.com.np https://maharjanmetal.com.np/products
Error output:
--2024-10-23 21:12:43-- https://maharjanmetal.com.np/products
Resolving maharjanmetal.com.np (maharjanmetal.com.np)... 149.100.146.116
Connecting to maharjanmetal.com.np (maharjanmetal.com.np)|149.100.146.116|:443... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: https://maharjanmetal.com.np/products/ [following]
--2024-10-23 21:12:43-- https://maharjanmetal.com.np/products/
Reusing existing connection to maharjanmetal.com.np:443.
HTTP request sent, awaiting response... 404 Not Found
2024-10-23 21:12:44 ERROR 404: Not Found.
What could be causing this 404 error when the site is clearly accessible through a browser? How can I successfully mirror this website using wget?
Environment:
Windows 11
Expected behavior:
wget should download the website content and its assets, creating a local mirror of the site
Actual behavior:
Receiving a 404 error despite the site being accessible through browsers
The command follows a 301 redirect from /products to /products/ but then fails with 404
No files are downloaded
The puzzling part is that the URL is perfectly accessible through browsers but wget consistently gets a 404 error after following the 301 redirect.
Upvotes: 0
Views: 33
Reputation: 635
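I checked your target URL and confirmed that it returns a 404 Not Found response, so wget stops as soon as it receives that status. A browser still renders the body of a 404 response, which is why the page can look accessible there.
If you still want to download this page, add the --content-on-error flag, which tells wget to save the response body even when the server returns an error status such as 404 Not Found: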
wget --mirror --convert-links --adjust-extension --page-requisites --no-parent --execute robots=off --content-on-error --user-agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/129.0.0.0 Safari/537.36" -P D:\client-sites\maharjanmetal.com.np https://maharjanmetal.com.np/products/
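To see why the browser and wget disagree, it can help to look at the status code and the response body separately. The following is a minimal sketch using only the Python standard library; `fetch_status` is a hypothetical helper name chosen for illustration and is not part of wget. It reports the final status code after redirects, together with the size of the body the server sent alongside it.

```python
import urllib.error
import urllib.request


def fetch_status(url, user_agent="Mozilla/5.0"):
    """Return (status_code, body_length) for url, even on HTTP error statuses.

    Hypothetical helper for illustration only. Redirects such as the
    301 from /products to /products/ are followed automatically.
    """
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status, len(response.read())
    except urllib.error.HTTPError as error:
        # An HTTPError still carries the response body. This is the content
        # a browser renders, and what wget keeps with --content-on-error.
        return error.code, len(error.read())


if __name__ == "__main__":
    # URL taken from the question; the result depends on the live server.
    print(fetch_status("https://maharjanmetal.com.np/products/"))
```

If this reports 404 with a non-zero body length, the server is sending a page together with an error status, which would match what wget shows in the log above.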
Upvotes: 0