Reputation: 1230
I'm trying to download an entire webpage using the following command:
wget -p -k www.myspace.com/
This downloads the page and any images or scripts under that directory, but I want a copy of the page that works completely offline. How do I get every image, script, and style sheet linked within the source for www.myspace.com, including those hosted on external sites?
Upvotes: 4
Views: 4812
Reputation: 651
wget -e robots=off -H -p -k http://www.myspace.com/
The -H (--span-hosts) flag is necessary for a complete copy, because the page is likely to include content from hosts outside the www.myspace.com domain. The -e robots=off option makes wget ignore robots.txt, for good measure.
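For readability, here is the same command spelled out with wget's long option names and wrapped in a small function. Treat it as a sketch: the function name save_page_offline is made up for illustration, and the URL in the usage line is simply the one from the question.

```shell
#!/bin/sh
# Long-option form of: wget -e robots=off -H -p -k URL
# save_page_offline is a hypothetical helper name, not part of wget.
save_page_offline() {
    # --execute robots=off : run "robots=off" as if it were in .wgetrc
    # --span-hosts         : allow fetching from hosts other than the start host
    # --page-requisites    : fetch images, CSS, and scripts the page needs
    # --convert-links      : rewrite links so the saved copy works offline
    wget \
        --execute robots=off \
        --span-hosts \
        --page-requisites \
        --convert-links \
        "$1"
}

# Usage (performs network requests):
#   save_page_offline http://www.myspace.com/
```

By default wget saves files under one directory per host, so the external content ends up next to the www.myspace.com directory rather than inside it.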
Upvotes: 10
Reputation: 7
wget -mk http://www.myspace.com/
works for me. I am not sure about MySpace, or whichever site you are trying to mirror, but sometimes you have to pass additional options to get around a site's robots.txt policy. I am not going to say how, because needing them means you are doing something you shouldn't be doing, although it is definitely possible.
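For reference, -mk is short for --mirror --convert-links. --mirror turns on recursion with unlimited depth plus timestamping, so it walks the whole site rather than a single page, and without -H it stays on the starting host. A minimal sketch with the long names (the function name mirror_site is hypothetical):

```shell
#!/bin/sh
# Long-option form of: wget -mk URL
# mirror_site is a hypothetical helper name, not part of wget.
mirror_site() {
    # --mirror        : recursive download, unlimited depth, with timestamping
    # --convert-links : rewrite links so the saved copy works offline
    wget --mirror --convert-links "$1"
}

# Usage (performs network requests and may download a lot):
#   mirror_site http://www.myspace.com/
```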
Upvotes: -1