tfonias74

Reputation: 136

Bash: Parse URLs from a file, process them, and then remove them from the file

I am trying to automate a procedure in which the system reads a file (one URL per line), uses wget to download the files from each URL (an HTTPS folder), and then removes that line from the file.

I have made several attempts, but the sed part (at the end) does not handle the string correctly (I tried escaping characters), so the line is never removed from the file.

cat File
https://something.net/xxx/data/Folder1/
https://something.net/xxx/data/Folder2/
https://something.net/xxx/data/Folder3/

My line of code is:

cat File | xargs -n1 -I @ bash -c 'wget -r -nd -l 1 -c -A rar,zip,7z,txt,jpg,iso,sfv,md5,pdf --no-parent --restrict-file-names=nocontrol --user=test --password=pass --no-check-certificate "@" -P /mnt/USB/ && sed -e 's|@||g' File'

It works up until the sed -e 's|@||g' File part.

Thanks in advance!

Upvotes: 0

Views: 332

Answers (4)

jraynal

Reputation: 517

@beliy's answer looks good!

If you want a one-liner, you can do:

while read -r line; do \
wget -r -nd -l 1 -c -A rar,zip,7z,txt,jpg,iso,sfv,md5,pdf \
--no-parent --restrict-file-names=nocontrol --user=test \
--password=pass --no-check-certificate "$line" -P /mnt/USB/ \
&& sed -i -e '\|'"$line"'|d' "File.txt"; \
done < File.txt

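EDIT: You need to add a \ in front of the first pipe, because sed only accepts a custom address delimiter when it is introduced with a backslash (\|regex|).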
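To illustrate the backslash-introduced address delimiter, here is a minimal sketch that deletes one URL line from a sample list. It assumes GNU sed (for -i without a suffix argument); the temp directory, the file name, and the URLs are placeholders taken from the question. Note that the URL is still treated as a regular expression and is not anchored, so a line that merely contains the URL would be deleted as well.

```shell
#!/usr/bin/env bash
# Sample list in a temp directory; the names and URLs are placeholders.
tmpdir=$(mktemp -d)
list="$tmpdir/File.txt"
printf '%s\n' \
  'https://something.net/xxx/data/Folder1/' \
  'https://something.net/xxx/data/Folder2/' \
  'https://something.net/xxx/data/Folder3/' > "$list"

line='https://something.net/xxx/data/Folder2/'

# \| opens the address with '|' as its delimiter, the next | closes it,
# and d deletes every line that matches.
sed -i -e '\|'"$line"'|d' "$list"

remaining=$(cat "$list")
printf '%s\n' "$remaining"
rm -rf "$tmpdir"
```

Without the leading backslash, sed would try to read | as a command rather than as the start of an address.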

Upvotes: 1

beliy

Reputation: 445

Don't use cat if you can avoid it. It's bad practice and can be a problem with big files. You can change

cat File | xargs -n1 -I @ bash -c 

to

for siteUrl in $( < "File" ); do

That is more correct, and it is simpler to use sed with double quotes. My variant:

scriptDir=$( dirname -- "$0" )
for siteUrl in $( < "$scriptDir/File.txt" )
do
    if [[ -z "$siteUrl" ]]; then break; fi # stop if the entry is empty
    wget -r -nd -l 1 -c -A rar,zip,7z,txt,jpg,iso,sfv,md5,pdf --no-parent --restrict-file-names=nocontrol --user=test --password=pass --no-check-certificate "$siteUrl" -P /mnt/USB/ && sed -i "s|$siteUrl||g" "$scriptDir/File.txt"
done
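One detail worth checking in the variant above: s|$siteUrl||g replaces the URL with nothing but keeps the line itself, so an empty line stays behind in the file. The sketch below shows that effect on a sample list and one possible cleanup. It assumes GNU sed; the temp directory, the file name, and the URLs are placeholders.

```shell
#!/usr/bin/env bash
# Sample list in a temp directory; the names and URLs are placeholders.
tmpdir=$(mktemp -d)
list="$tmpdir/File.txt"
printf '%s\n' \
  'https://something.net/xxx/data/Folder1/' \
  'https://something.net/xxx/data/Folder2/' > "$list"

siteUrl='https://something.net/xxx/data/Folder1/'

# Substitution blanks the matching text but keeps the (now empty) line.
sed -i "s|$siteUrl||g" "$list"
lines_after_subst=$(wc -l < "$list")
empty_after_subst=$(grep -c '^$' "$list")

# Deleting empty lines afterwards removes the leftover.
sed -i '/^$/d' "$list"
lines_after_cleanup=$(wc -l < "$list")

printf 'after substitution: %s lines, %s empty\n' "$lines_after_subst" "$empty_after_subst"
printf 'after cleanup: %s lines\n' "$lines_after_cleanup"
rm -rf "$tmpdir"
```

Because the for loop splits the file contents on whitespace, leftover empty lines are skipped on the next run, so the cleanup is mostly cosmetic.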

Upvotes: 2

Mario

Reputation: 1032

I see what you are trying to do, but I don't understand the sed command with the pipes. Maybe it is some fancy format that I don't know.

Anyway, I think the sed command should look like this...

sed -e 's/@//g'

This command will remove all @ from the stream.
I hope this helps!
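For context, the pipes in the question's sed command are an alternative delimiter: the s command accepts almost any character in place of /. Also, in the question's pipeline xargs -I @ replaces @ with the current URL before sed runs, so the pattern contains slashes, and a / delimiter would then need every slash escaped. The sketch below compares both spellings on a sample URL; the URL is the placeholder from the question.

```shell
#!/usr/bin/env bash
url='https://something.net/xxx/data/Folder1/'

# With '|' as the delimiter, the slashes in the pattern need no escaping.
with_pipe=$(printf '%s\n' "$url" | sed -e 's|https://something.net/xxx/||')

# The same substitution with '/' as the delimiter: every slash is escaped.
with_slash=$(printf '%s\n' "$url" | sed -e 's/https:\/\/something.net\/xxx\///')

printf '%s\n%s\n' "$with_pipe" "$with_slash"
```

Both commands should print the same remainder of the URL, which is why the pipe form is usually preferred when the pattern is a path or URL.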

Upvotes: 0

zeehio

Reputation: 4138

I believe you just need to use double quotes after sed -e. Instead of:

'...&& sed -e 's|@||g' File'

you would need

'...&& sed -e '"'s|@||g'"' File'
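The reason this quoting works can be sketched on a small scale. The shell joins adjacent quoted pieces into one word, so '...'"'...'"'...' yields a single string that contains literal single quotes. In the original command the inner quotes end the outer string instead, which leaves s|@||g unquoted and lets the outer shell treat the | characters as pipes. The example below uses a made-up input (a@b) rather than the question's URLs. Keep in mind that sed without -i only prints the result and does not modify the file.

```shell
#!/usr/bin/env bash
# Build the command string for bash -c from two adjacent quoted pieces:
#   single-quoted: printf "%s\n" "a@b" | sed -e
#   double-quoted: 's|@||g'   (the single quotes are kept literally)
cmd='printf "%s\n" "a@b" | sed -e '"'s|@||g'"

# Show the string that the inner shell will receive.
printf '%s\n' "$cmd"

# Run it the same way xargs would, through bash -c.
result=$(bash -c "$cmd")
printf '%s\n' "$result"
```

The first line printed is the command as the inner bash sees it, with the sed script properly quoted; the second is that command's output.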

Upvotes: 1
