Reputation: 31
I will download a lot of files from a server with wget. But the files should only be stored when the file name is in a given list. Otherwise wget should stop getting these file and start the next one.
I tried the following:
#!/bin/bash
etsienURL="http://www.etsi.org/deliver/etsi_en"
etsitsURL="http://www.etsi.org/deliver/etsi_ts"
listOfStandards=("en_302571" "en_3023630401" "en_3023630501" "en_3023630601" "en_30263702" "en_30263703" "en_302663" "en_302931" "ts_10153901" "ts_10153903" "ts_1026360501" "ts_1027331" "ts_10286801" "ts_10287103" "ts_10289401" "ts_10289402" "ts_102940" "ts_102941" "ts_102942" "ts_102943" "ts_103097" "ts_10324601" "ts_10324603")
wget -r -nd -nc -e robots=off -A.pdf $etsienURL
wget -r -nd -nc -e robots=off -A.pdf $etsitsURL
for file in *.pdf
do
relevant=false
for t in "${listOfStandards[@]}"
do
if [[ $(basename "$file" .pdf) == *"$t"* ]]
then
relevant=true
break
fi
done
if [ $relevant == false ]
then
rm "$file"
fi
done
With this code all files will be downloaded. After the download the script checks, if the filename or a part of this is in the list. Otherwise the script delete the file. But this cost a lot of disc space. I will only download a file, if the file name contains one if the list items.
Perhaps somebody can help to find a solution.
Upvotes: 2
Views: 471