Reputation: 3923
I'm using wget to download an entire website.
I used the following command (on Windows 7):
wget ^
--recursive ^
-A "*thread*,*label*" ^
--no-clobber ^
--page-requisites ^
--html-extension ^
--domains example.com ^
--random-wait ^
--no-parent ^
--background ^
--header="Accept: text/html" --user-agent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:21.0) Gecko/20100101 Firefox/21.0" ^
http://example.com/
After two days, my little brother restarted the PC, so I tried to resume the stopped process by adding the following option to the command:
--continue ^
The command now looks like this:
wget ^
--recursive ^
-A "*thread*,*label*" ^
--no-clobber ^
--page-requisites ^
--html-extension ^
--domains example.com ^
--random-wait ^
--no-parent ^
--background ^
--continue ^
--header="Accept: text/html" --user-agent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:21.0) Gecko/20100101 Firefox/21.0" ^
http://example.com/
Unfortunately, it started a new job: it downloads the same files again and writes to a new log file named
wget-log.1
Is there any way to resume mirroring a site with wget, or do I have to start the whole thing over again?
Upvotes: 10
Views: 9068
Reputation: 101
Try the -nc option. It re-checks every file, but doesn't download files that are already present.
I'm using this command to download a website:
wget -r -t1 domain.com -o log
I stopped the process and wanted to resume it, so I changed the command:
wget -nc -r -t1 domain.com -o log
The log then contains lines like this:
File .... already there; not retrieving. etc.
I checked the logs earlier, and it seems that after roughly five minutes of this kind of checking it begins downloading new files.
I'm using this manual for wget: http://www.linux.net.pl/~wkotwica/doc/wget/wget_8.html
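Applied to the question's original command, this suggestion amounts to dropping --continue (the command already passes --no-clobber, and the two approaches work at cross purposes) and, optionally, --background, so the "already there" messages are visible. A sketch, assuming the same Windows wget build and the original example.com target:

```shell
wget ^
  --recursive ^
  -A "*thread*,*label*" ^
  --no-clobber ^
  --page-requisites ^
  --html-extension ^
  --domains example.com ^
  --random-wait ^
  --no-parent ^
  --header="Accept: text/html" --user-agent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:21.0) Gecko/20100101 Firefox/21.0" ^
  http://example.com/
```

With -nc in a recursive download, wget loads already-saved .html files from disk and parses them for links instead of fetching them again, which is what makes the resume possible; expect the re-checking phase the answer describes before new files start arriving.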
Upvotes: 10