RALPH BURAIMO

Reputation: 15

How to use wget with many URLs in a .txt file to download and save as

I have a .txt file with many direct links to download, each followed by the name the file should be saved under. The file looks like this:

http://example.com/file1.png name_of_the_file1
http://example.com/file2.mp4 name_of_the_file2
http://example.com/file3.mkv name_of_the_file3
http://example.com/file4.png name_of_the_file4
http://example.com/file5.avi name_of_the_file5

As you can see, the URL and the filename are separated by a space.

What I want is a Linux command that goes through the .txt file, downloads each file with wget, and saves it under its respective name.

Any help would be appreciated, thanks!

Note 1: there is exactly one space between the URL and the filename.

Note 2: the filename may contain spaces; see the example below:

http://example.com/47188.png Abaixo de Zero (2021)

Upvotes: 0

Views: 1741

Answers (4)

devOps

Reputation: 80

    while IFS= read -r line; do
        IFS=' '
        read -ra strarr <<< "$line"
        if [[ ${#strarr[@]} -gt 2 ]]
        then
                # the filename contains spaces: rejoin fields 1..n-1
                filename=''
                for (( i=${#strarr[@]}-1; i>0; i-- ))
                do
                        filename="${strarr[i]} $filename"
                done
                filename="${filename% }"   # drop the trailing space
                wget "${strarr[0]}" -O "$filename"
        else
                wget "${strarr[0]}" -O "${strarr[1]}"
        fi
    done < filename.txt

This code was modified and is now able to create files whose names contain more than one word. The code is not very clear, though, because I don't know all the features of bash scripting; I actually use Python.
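
For what it's worth, plain bash can do the first-space split without an array: read assigns everything after the first field to the last variable it is given. A minimal sketch, assuming the same filename.txt layout as above:

    while read -r url name; do
            # "name" receives the rest of the line, spaces included
            wget "$url" -O "$name"
    done < filename.txt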

Upvotes: 0

You can use this awk|xargs one-liner:

awk '{url=$1; $1="";{sub(/ /,"");out=url" -O \""$0"\""}; print out}' file.txt | xargs -L 1 wget

Explanation:

    url=$1 # temp var inside awk
    $1="" # blank out the url field
    {sub(/ /,"");out=url" -O \""$0"\""} # build the output string
        sub(/ /,"") # trim the leading white space left by blanking $1
        out=url" -O \""$0"\"" # format the line, escaping the quotes
    print out # シ
    xargs -L 1 wget # feed awk's output to wget line by line
plus some awk syntax sugar

Example:

cat << EOF >> file.txt
https://www.openssl.org/source/old/1.1.1/openssl-1.1.1k.tar.gz name_of_the_file2
https://www.openssl.org/source/old/1.1.1/openssl-1.1.1j.tar.gz name of_the_file3
https://www.openssl.org/source/old/1.1.1/openssl-1.1.1i.tar.gz name of the_file4
https://www.openssl.org/source/old/1.1.1/openssl-1.1.1h.tar.gz name of the file5
EOF
awk '{url=$1; $1="";{sub(/ /,"");out=url" -O \""$0"\""}; print out}' file.txt | xargs -L 1 wget
ls -1
name_of_the_file2
'name of_the_file3'
'name of the_file4'
'name of the file5'
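
To see what gets handed to wget, you can run just the awk stage on that file.txt; it prints one command line per input line, with multi-word names kept inside quotes so xargs passes each one as a single argument:

    awk '{url=$1; $1="";{sub(/ /,"");out=url" -O \""$0"\""}; print out}' file.txt
    https://www.openssl.org/source/old/1.1.1/openssl-1.1.1k.tar.gz -O "name_of_the_file2"
    https://www.openssl.org/source/old/1.1.1/openssl-1.1.1j.tar.gz -O "name of_the_file3"
    https://www.openssl.org/source/old/1.1.1/openssl-1.1.1i.tar.gz -O "name of the_file4"
    https://www.openssl.org/source/old/1.1.1/openssl-1.1.1h.tar.gz -O "name of the file5"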

Upvotes: 1

devOps

Reputation: 80

You can use this code:

while IFS= read -r line; do
        IFS=' '
        read -ra strarr <<< "$line"
        wget -O "${strarr[1]}" "${strarr[0]}"   # note: assumes the filename has no spaces
done < filename.txt

It's a bash script. If you don't know how to use one:

  1. Paste it into a file, e.g. file.sh (a complete sketch follows below)
  2. Make it executable: chmod +x file.sh
  3. Run it: ./file.sh

P.S. Don't forget to replace filename.txt with the actual file that contains your links.
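
For completeness, a minimal sketch of what file.sh could contain, with a shebang line added so ./file.sh runs under bash (filename.txt stands in for your actual list of links):

    #!/bin/bash
    # first field is the URL, second is the name to save it under
    while IFS= read -r line; do
            IFS=' '
            read -ra strarr <<< "$line"
            wget -O "${strarr[1]}" "${strarr[0]}"
    done < filename.txt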

Upvotes: 0

avika81

Reputation: 91

The simplest thing I can think of is the following short Python script:

import os

lines = open('<name_of_your_file>').readlines()
for line in lines:
    url, file_name = line.strip().split(' ', 1)   # split on the first space only
    # -O (capital) sets the output filename; -o would only redirect wget's log.
    # The name is quoted so spaces survive the shell.
    os.system(f'wget {url} -O "{file_name}"')

If you want it as a bash one-liner, the following works:

$ python -c "import os; lines = open('<name_of_your_file>').readlines(); [os.system(f'wget {url} -O \"{file_name}\"') for url, file_name in [line.strip().split(' ', 1) for line in lines]]"

Upvotes: 0
