Reputation: 21
I am trying to download PDF files from a list of URLs in a .txt file, with one URL per line. ('urls.txt')
When I use the following command, where the URL I used is an exact copy-paste of the first line of the .txt file:
$ curl http://www.isuresults.com/results/season1617/gpchn2016/gpchn2016_protocol.pdf -o 'test.pdf'
the PDF downloads perfectly. However, when I use this command:
xargs -n 1 curl -O < urls.txt
I receive a 'curl: (3) URL using bad/illegal format or missing URL' error once for each URL listed in the .txt file. I have tested many of the URLs individually, and they all download properly.
How can I fix this?
Edit - the first three lines of urls.txt reads as follows:
http://www.isuresults.com/results/season1718/gpf1718/gpf2017_protocol.pdf
http://www.isuresults.com/results/season1718/gpcan2017/gpcan2017_protocol.pdf
http://www.isuresults.com/results/season1718/gprus2017/gprus2017_protocol.pdf
SOLVED: As per the comment below, the issue was that the .txt file was in DOS/Windows format. I converted it using the line:
$ dos2unix urls.txt
and then the files downloaded perfectly using my original line of code. See this thread for more info: Are shell scripts sensitive to encoding and line endings?
Thank you to all who responded!
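For anyone hitting the same error, the stray carriage returns are easy to spot and remove even without dos2unix. A minimal sketch (urls_demo.txt here is a stand-in for the real urls.txt, with one sample URL taken from the list above):

```shell
# Recreate the symptom: a URL list saved with DOS (CRLF) line endings.
printf 'http://www.isuresults.com/results/season1718/gpf1718/gpf2017_protocol.pdf\r\n' > urls_demo.txt

# cat -v shows each hidden carriage return as ^M at the end of the line.
cat -v urls_demo.txt

# tr is a POSIX alternative to dos2unix: delete every \r from the file.
tr -d '\r' < urls_demo.txt > urls_clean.txt

# The original command then works against the cleaned list:
# xargs -n 1 curl -O < urls_clean.txt
```

The error message makes sense in hindsight: each line xargs hands to curl ends in an invisible `\r`, so curl sees a malformed URL rather than the one you tested by hand.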
Upvotes: 0
Views: 6454
Reputation: 6426
Try using
xargs -n 1 -t -a urls.txt curl -O
Here the -a option makes xargs read the list from a file rather than from standard input.
EDIT:
As @GordonDavisson mentioned, it looks like your file has DOS line endings. You can strip these with sed before passing the list to xargs:
sed 's/\r//' < urls.txt | xargs -n 1 -t curl -O
Upvotes: 1