Reputation: 337
I'm trying to mirror a website with the following URL format:
http://example.com/homepage?page=1
I want to mirror only the pages with query strings from page=1 to page=100. How do I do this as efficiently as possible with wget?
I do not need to mirror recursively, only pages 1 to 100. Saving the CSS/JS would also be nice, and excluding images would help keep it fast (I'm only interested in the text).
Help?
Upvotes: 0
Views: 517
Reputation: 97918
Create a list of URLs:
seq 1 100 | xargs -I {} echo 'http://example.com/homepage?page={}' > URLS.txt
Then download all with wget:
wget -i URLS.txt
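The question also asks for the CSS/JS while skipping images. Here is a rough sketch of one way to cover that, assuming GNU wget: -p (--page-requisites) fetches the files each page needs, and -R (--reject) should skip files with the listed suffixes. The helper name mirror_pages and the image suffix list are assumptions for illustration, not part of the original answer.

```shell
# Build the URL list; the quotes keep the shell from treating '?' as a glob.
for i in $(seq 1 100); do
  printf 'http://example.com/homepage?page=%s\n' "$i"
done > URLS.txt

# Hypothetical helper: download every URL in URLS.txt.
#   -p  also fetch page requisites such as CSS and JS
#   -R  reject files ending in these suffixes (assumed image types)
mirror_pages() {
  wget -i URLS.txt -p -R 'jpg,jpeg,png,gif,svg,webp,ico'
}
```

Run mirror_pages once URLS.txt looks right. If the site serves images under other extensions, extend the -R list to match.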
Upvotes: 2