FoCMB

Reputation: 31

shell script - Download files with wget only when file name is in my list

I want to download a lot of files from a server with wget, but a file should only be stored if its name is in a given list. Otherwise wget should skip that file and move on to the next one.

I tried the following:

#!/bin/bash

etsienURL="http://www.etsi.org/deliver/etsi_en"
etsitsURL="http://www.etsi.org/deliver/etsi_ts"

listOfStandards=("en_302571" "en_3023630401" "en_3023630501" "en_3023630601" "en_30263702" "en_30263703" "en_302663" "en_302931" "ts_10153901" "ts_10153903" "ts_1026360501" "ts_1027331" "ts_10286801" "ts_10287103" "ts_10289401" "ts_10289402" "ts_102940" "ts_102941" "ts_102942" "ts_102943" "ts_103097" "ts_10324601" "ts_10324603")

# Recursively fetch every PDF below each URL (no filtering by name yet)
wget -r -nd -nc -e robots=off -A.pdf "$etsienURL"
wget -r -nd -nc -e robots=off -A.pdf "$etsitsURL"
for file in *.pdf
    do
        relevant=false
        for t in "${listOfStandards[@]}"
            do
                if [[ $(basename "$file" .pdf) == *"$t"* ]]
                then
                    relevant=true
                    break
                fi
            done
        if [ $relevant == false ]
        then
            rm "$file"
        fi
    done

With this code, all files are downloaded. After the download, the script checks whether the file name, or a part of it, is in the list; if not, the script deletes the file. But this costs a lot of disk space. I want to download a file only if its name contains one of the list items.

Perhaps somebody can help me find a solution.

Upvotes: 2

Views: 471

Answers (1)

FoCMB

Reputation: 31

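Found the solution: I forgot the --no-parent option for wget. Without it, a recursive download can follow links up into parent directories and fetch far more than the two directories I wanted.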
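For illustration, here is a minimal sketch of how --no-parent could be combined with an accept list built from the array, so that wget keeps only the PDFs whose names match. The helper names build_accept_list and fetch are hypothetical, the array is shortened, and I am assuming the file names on the server contain the list entries as in the question. Note that wget may still fetch the HTML index pages temporarily to follow links and then delete them.

```shell
#!/bin/bash

# Shortened list for illustration; use the full listOfStandards from the question.
listOfStandards=("en_302571" "en_302663" "ts_102940" "ts_103097")

# Hypothetical helper: turn each name into a "*name*.pdf" pattern and join
# the patterns with commas. wget treats an -A entry that contains a wildcard
# as a pattern for the file name instead of a plain suffix.
build_accept_list() {
    local patterns=() name
    for name in "$@"; do
        patterns+=("*${name}*.pdf")
    done
    local IFS=,
    printf '%s\n' "${patterns[*]}"
}

acceptList=$(build_accept_list "${listOfStandards[@]}")

# Hypothetical helper: recursive download restricted to the start directory
# (--no-parent) and to file names matching the accept list (-A).
fetch() {
    wget -r -nd -nc -e robots=off --no-parent -A "$acceptList" "$1"
}
```

The downloads would then be started with fetch "http://www.etsi.org/deliver/etsi_en/" and fetch "http://www.etsi.org/deliver/etsi_ts/". The trailing slash matters: without it, wget treats the last path component as a file, and --no-parent then only limits the crawl to the directory above it.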

Upvotes: 1
