I am using the following command to download a single web page with all of its images and JS using wget on Windows 7:
wget -E -H -k -K -p -e robots=off -P /Downloads/ http://www.vodafone.de/privat/tarife/red-smartphone-tarife.html
It downloads the HTML as required, but when I tried to pass in a text file containing a list of 3 URLs to download, it gave no output. Below is the command I am using:
wget -E -H -k -K -p -e robots=off -P /Downloads/ -i ./list.txt -B 'http://'
I tried this also:
wget -E -H -k -K -p -e robots=off -P /Downloads/ -i ./list.txt
For this attempt, the text file had http:// prepended to the URLs. list.txt contains the list of 3 URLs which I need to download with a single command. Please help me resolve this issue.
Upvotes: 41
Views: 73933
Reputation: 1633
The existing answers here are really helpful, but what if you end up with duplicates?
(I left this info as comments on Tek Mentor's answer, but then decided to post in the form of an answer.)
You may encounter an issue with duplicate files being created if, for example, your URL list contains the same file name at the end of different paths:
dir0/script.js
dir1/script.js
dir2/script.js
In this case, wget -i url-list.txt will give you the following in your current directory:
script.js
script.js.1
script.js.2
A number is appended to each subsequent instance of script.js to avoid a name collision, but it won't be clear which file came from which path.
If this happens, you can take one of two approaches.

The first is to discard the duplicates. This is easy if the files all have the same contents and you don't need more than one copy. You can delete the extras after downloading, or prevent duplicates from being downloaded in the first place with -N or -nc (both are described in the wget man page).
Example: wget -N -i url-list.txt
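If you would rather have wget skip any download whose file name already exists locally (instead of comparing timestamps, as -N does), the same command works with -nc; a minimal sketch using the same url-list.txt as above:
wget -nc -i url-list.txt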
The second approach is usually more useful, as in many cases you will want to preserve the relative paths. The options -x and -nH, used together, recreate each URL's directory structure while stripping the hostname (-x forces directory creation, -nH omits the host directory).
Example: wget -x -nH -i url-list.txt
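With the example list above, and assuming all three paths live on the same host, this should leave you with something like:
dir0/script.js
dir1/script.js
dir2/script.js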
Note that -i must directly precede the file name, because the file name is an argument to -i. This is why the other options are placed first.
For more info on wget, the Ubuntu ManPages are very helpful.
Upvotes: 0
Reputation: 74018
From man wget:
2 Invoking
By default, Wget is very simple to invoke. The basic syntax is:
wget [option]... [URL]...
So, just use multiple URLs:
wget URL1 URL2
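For example, using two of the pages from the question:
wget http://www.vodafone.de/privat/tarife/red-smartphone-tarife.html http://www.verizonwireless.com/smartphones-2.shtml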
Or using the links from comments:
$ cat list.txt
http://www.vodafone.de/privat/tarife/red-smartphone-tarife.html
http://www.verizonwireless.com/smartphones-2.shtml
http://www.att.com/shop/wireless/devices/smartphones.html
and your command line:
wget -E -H -k -K -p -e robots=off -P /Downloads/ -i ./list.txt
works as expected.
Upvotes: 68
Reputation: 237
If you have a list of URLs on separate lines like this:
http://example.com/a
http://example.com/b
http://example.com/c
but you don't want to create a file and point wget to it, you can do this:
wget -i - <<< 'http://example.com/a
http://example.com/b
http://example.com/c'
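An equivalent form, if you prefer to build the list with a pipe (a minimal sketch; the example.com URLs are the same placeholders as above), relies on wget reading the URL list from standard input when -i is given - as its argument:
printf '%s\n' 'http://example.com/a' 'http://example.com/b' 'http://example.com/c' | wget -i -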
Upvotes: 2
Reputation: 3035
A pedantic version:
for x in {'url1','url2'}; do wget "$x"; done
The advantage is that you can treat it as a single wget command over a list of URLs.
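A variant in the same spirit, reading the URLs from a file such as the question's list.txt instead of hard-coding them (a sketch, assuming one URL per line):
while read -r url; do wget "$url"; done < list.txt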
Upvotes: 0
Reputation: 367
First create a text file with the URLs that you need to download, e.g. download.txt. The contents of download.txt will look like this:
http://www.google.com
http://www.yahoo.com
Then use the command wget -i download.txt to download the files. You can add many URLs to the text file.
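One way to create the file and run the download entirely from the shell (a minimal sketch; download.txt matches the file name above):
$ cat > download.txt <<'EOF'
http://www.google.com
http://www.yahoo.com
EOF
$ wget -i download.txt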
Upvotes: 25