How to resume wget mirroring of a website?

I'm using wget to download an entire website.
I used the following command (on Windows 7):

wget ^
 --recursive ^
 -A "*thread*, *label*" ^
 --no-clobber ^
 --page-requisites ^
 --html-extension ^
 --domains example.com ^
 --random-wait ^
 --no-parent ^
 --background ^
 --header="Accept: text/html" --user-agent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:21.0) Gecko/20100101 Firefox/21.0" ^
     http://example.com/

After two days my little brother restarted the PC,
so I tried to resume the stopped process.
I added the following option to the command:

--continue ^

so the command now looks like this:

wget ^
     --recursive ^
     -A "*thread*, *label*" ^
     --no-clobber ^
     --page-requisites ^
     --html-extension ^
     --domains example.com ^
     --random-wait ^
     --no-parent ^
     --background ^
     --continue ^
     --header="Accept: text/html" --user-agent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:21.0) Gecko/20100101 Firefox/21.0" ^
         http://example.com/

Unfortunately it started a new job: it downloads the same files again and writes a new log file named

wget-log.1

Is there any way to resume mirroring a site with wget, or do I have to start the whole thing over again?

Upvotes: 10

Views: 9068

Answers (1)

jack daniels

Reputation: 101

Try the -nc option. It checks everything once again, but doesn't re-download files that are already there.

I'm using this command to download one website: wget -r -t1 domain.com -o log

I stopped the process and wanted to resume it, so I changed the command: wget -nc -r -t1 domain.com -o log

In the logs there are lines like this: File .... already there; not retrieving. and so on.

I checked the logs before this, and it seems that after maybe 5 minutes of this kind of checking it starts downloading new files.
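
Applied to the command from the question, this would mean re-running the original invocation with --continue left out; the --no-clobber flag already in it is the long form of -nc, so files fetched before the restart should be skipped rather than downloaded again. This is only a sketch, with example.com standing in for the real site as in the question:

rem Re-run of the question's command without --continue; -nc/--no-clobber skips files already on disk
wget ^
 --recursive ^
 -A "*thread*, *label*" ^
 --no-clobber ^
 --page-requisites ^
 --html-extension ^
 --domains example.com ^
 --random-wait ^
 --no-parent ^
 --background ^
 --header="Accept: text/html" --user-agent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:21.0) Gecko/20100101 Firefox/21.0" ^
     http://example.com/

Because of --background, this run will again write its output to a fresh log file (the question already shows wget-log.1), so look there rather than in the original wget-log for the "already there; not retrieving" lines.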

I'm using this manual for wget: http://www.linux.net.pl/~wkotwica/doc/wget/wget_8.html

Upvotes: 10
