user1915050

Reputation:

How to download multiple URLs using wget using a single command?

I am using the following command to download a single webpage with all its images and JS using wget on Windows 7:

wget -E -H -k -K -p -e robots=off -P /Downloads/ http://www.vodafone.de/privat/tarife/red-smartphone-tarife.html

It downloads the HTML as required, but when I tried to pass it a text file containing a list of 3 URLs to download, it produced no output. Below is the command I am using:

wget -E -H -k -K -p -e robots=off -P /Downloads/ -i ./list.txt -B 'http://'

I tried this also:

wget -E -H -k -K -p -e robots=off -P /Downloads/ -i ./list.txt

For this attempt, the URLs in the text file already had http:// prepended.

list.txt contains the list of 3 URLs which I need to download using a single command. Please help me resolve this issue.

Upvotes: 41

Views: 73933

Answers (5)

Mentalist

Reputation: 1633

The existing answers here are really helpful, but what if you end up with duplicates?

(I left this info as comments on Tek Mentor's answer, but then decided to post in the form of an answer.)

You may encounter an issue with duplicate files being created if, for example, your URL list contains the same file name at the end of different paths:

  • dir0/script.js
  • dir1/script.js
  • dir2/script.js

In this case wget -i url-list.txt will give you the following in your current directory:

  • script.js
  • script.js.1
  • script.js.2...

A number will be appended to each subsequent instance of script.js to avoid a namespace collision, but it won't be clear which file came from which path.

If this happens you can either:

ignore duplicate file names

It's easy if the files all have the same contents and you don't need more than one copy. You can delete the duplicates after downloading, or prevent them from being downloaded in the first place using -N (timestamping) or -nc (no-clobber); see the wget man page for more info.

Example: wget -N -i url-list.txt

-OR-

recreate the directory structure

This is usually more useful, as in many cases you will want to preserve the relative paths. The options -x -nH can be used to strip the hostname and recreate the same directory structure (-x forces directory creation; -nH omits the host-prefixed directory).

Example: wget -x -nH -i url-list.txt

Note that the file name must come directly after -i, because it is the argument to -i; this is why the other options are placed first.
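For instance, a minimal sketch combining the asker's original flags with the directory-preserving options (url-list.txt is just a placeholder file name):

# -x recreates the remote directory tree, -nH drops the hostname directory,
# and -i (placed last) reads one URL per line from url-list.txt
wget -E -H -k -K -p -x -nH -e robots=off -P /Downloads/ -i url-list.txt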

For more info on wget, the Ubuntu ManPages are very helpful.

Upvotes: 0

Olaf Dietsche

Reputation: 74018

From man wget:

2 Invoking
By default, Wget is very simple to invoke. The basic syntax is:
wget [option]... [URL]...

So, just use multiple URLs:

wget URL1 URL2

Or using the links from comments:

$ cat list.txt
http://www.vodafone.de/privat/tarife/red-smartphone-tarife.html
http://www.verizonwireless.com/smartphones-2.shtml
http://www.att.com/shop/wireless/devices/smartphones.html

and your command line:

wget -E -H -k -K -p -e robots=off -P /Downloads/ -i ./list.txt

works as expected.

Upvotes: 68

rwenz3l

Reputation: 237

If you have a list of URLs on separate lines like this:

http://example.com/a
http://example.com/b
http://example.com/c

but you don't want to create a file and point wget to it, you can do this:

wget -i - <<< 'http://example.com/a
http://example.com/b
http://example.com/c'
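The <<< here-string is a bash feature. As a sketch, an equivalent pipe works too, since -i - tells wget to read the URL list from standard input:

printf '%s\n' 'http://example.com/a' 'http://example.com/b' 'http://example.com/c' | wget -i -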

Upvotes: 2

Ardhi

Reputation: 3035

Pedantic version:

for x in {'url1','url2'}; do wget "$x"; done

The advantage is that it still reads like a single wget command, even though wget runs once per URL.
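If the URLs already live in a file like the asker's list.txt, a hedged equivalent is a read loop:

# read one URL per line from list.txt and fetch each in turn
while IFS= read -r url; do wget "$url"; done < list.txt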

Upvotes: 0

Tek Mentor

Reputation: 367

First create a text file with the URLs that you need to download, e.g. download.txt.

download.txt will look like this:

http://www.google.com
http://www.yahoo.com

Then use the command wget -i download.txt to download the files. You can add as many URLs to the text file as you like.
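As a quick sketch of the whole flow, assuming a shell is available (the here-document is just one way to create the file):

# create download.txt with one URL per line, then fetch everything
cat > download.txt <<'EOF'
http://www.google.com
http://www.yahoo.com
EOF
wget -i download.txt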

Upvotes: 25
