uingtea

Reputation: 6524

xargs wget extract filename from URL with Parameter

I want to download files in parallel, but wget does not save them under the correct filename.

url.txt

http://example.com/file1.zip?arg=tereef&arg2=okook
http://example.com/file2.zip?arg=tereef&arg2=okook

command

xargs -P 4 -n 1 wget <url.txt

output filename

file1.zip?arg=tereef&arg2=okook
file2.zip?arg=tereef&arg2=okook

expected output

file1.zip
file2.zip

I'm new to bash. Please suggest how to get the correct filenames, and please don't suggest a for loop or &, because it blocks.

Thank you

Upvotes: 3

Views: 1497

Answers (3)

Ole Tange

Reputation: 33685

With GNU Parallel it looks like this:

parallel -P 4 wget -O '{= s/\?.*//;s:.*/:: =}' {} <url.txt
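The {= ... =} part is a Perl expression that GNU Parallel evaluates on each input line to build the replacement string: the first substitution drops the query string, the second drops everything up to the last slash. The same two substitutions can be tried in isolation with sed. The sketch below is only an illustration (strip_url is a hypothetical helper name) and does not download anything:

```shell
# Hypothetical helper: apply the same two substitutions with sed.
strip_url() {
    # 1) delete from the first "?" to the end of the line
    # 2) delete everything up to and including the last "/"
    printf '%s\n' "$1" | sed -e 's/[?].*//' -e 's:.*/::'
}

strip_url 'http://example.com/file1.zip?arg=tereef&arg2=okook'   # prints file1.zip
```

The bracket expression [?] is used instead of \? because GNU sed treats \? as a quantifier in basic regular expressions.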

Upvotes: 0

Diego Torres Milano

Reputation: 69218

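You can use a bash function, which you have to export so that the child shells started by xargs can see it:

function mywget()
{
    local name=${1%%\?*}     # strip the query string
    name=${name##*/}         # keep only the last path component
    wget -O "$name" "$1"     # quote both so "&" and "?" stay literal
}
export -f mywget
xargs -P 4 -n 1 bash -c 'mywget "$1"' _ < url.txt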
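The filename can be derived from the URL with two shell parameter expansions: %%\?* drops the query string and ##*/ drops everything up to the last slash. A minimal sketch that can be run without network access, where url_to_name is a hypothetical helper name used only for illustration:

```shell
# Hypothetical helper, not part of wget: derive a local filename from a URL.
url_to_name() {
    name=${1%%\?*}    # remove the longest suffix that starts at a "?"
    name=${name##*/}  # remove the longest prefix that ends in a "/"
    printf '%s\n' "$name"
}

url_to_name 'http://example.com/file1.zip?arg=tereef&arg2=okook'   # prints file1.zip
url_to_name 'http://example.com/dir/file2.zip'                     # prints file2.zip
```

A URL without a query string passes through the first expansion unchanged, so the helper works for both forms.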

Upvotes: 1

xxfelixxx

Reputation: 6602

Process your input to produce the desired wget arguments, then run them through xargs.

perl -ne : iterate over the lines of the input file and run the inline program on each one

-e : Execute perl one-liner

-n : Loop over all input lines, assigning each to $_ in turn.

xargs -P 4 -n 3 -t wget

-P 4 : Max of 4 processes at a time

-n 3 : Pass three arguments to each wget call (the URL, -O, and the filename)

-t : Print the command before executing it

Do not use -i (the replace string "{}") here: it passes the whole input line to wget as a single argument, so wget requests a URL ending in "%20-O%20file1.zip" and fails with 404 Not Found.

perl -ne '
    chomp(my ($url) = $_);                         # Remove trailing newline
    my ($name) = $url =~ m|example.com/(.+)\?|;    # Grab the filename
    print "$url -O $name\n";                       # Print all of the wget params
' url.txt | xargs -P 4 -n 3 -t wget

Output (the commands echoed by -t)

wget http://example.com/file1.zip?arg=tereef&arg2=okook -O file1.zip
wget http://example.com/file2.zip?arg=tereef&arg2=okook -O file2.zip
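How xargs groups the generated arguments can be checked without any network access. The sketch below is only an illustration: the sample line is the one the perl program prints for the first URL, and the variable names are made up for this example. It counts the arguments the child command receives. With a replace string (-i, or its POSIX form -I {}) the whole line arrives as one argument, while -n 3 splits it into URL, -O and filename:

```shell
# Sample line in the "URL -O NAME" form (assumed for illustration).
line='http://example.com/file1.zip?arg=tereef&arg2=okook -O file1.zip'

# "$#" inside the child shell is the number of arguments after the "_" placeholder.
count_with_replace=$(printf '%s\n' "$line" | xargs -I {} sh -c 'echo $#' _ {})
count_with_n3=$(printf '%s\n' "$line" | xargs -n 3 sh -c 'echo $#' _)

echo "-I {}: $count_with_replace argument(s)"   # whole line as a single argument
echo "-n 3:  $count_with_n3 argument(s)"        # URL, -O, filename
```

This is why the -O option only takes effect when xargs splits the line on whitespace instead of substituting it as a whole.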

Upvotes: 0
