Reputation: 4495
I am trying to download a pdf file using wget.
When I do:
wget <url>
it downloads a corrupted file. However, if I run wget -i test.txt
with the PDF URL inside test.txt, it works and the file is not corrupted.
Does anyone know why?
From the logs I can see the following.
In the first case, it is downloading a "not found" page.
Length: 11322 (11K) [text/html] Saving to: ‘media.nl?id=39194.1’
In the second it is a proper pdf.
Length: 58272 (57K) [application/pdf] Saving to: ‘media.nl?id=39194&c=4667446&h=34c63dbaaa7adc7c8a33&_xt=.pdf’
Thanks,
Upvotes: 0
Views: 2384
Reputation: 1743
I had the same issue, but I changed the command to this and it worked fine when I tested it:
wget --no-check-certificate https://www.roofingsuppliesuk.co.uk/core/media/'media.nl?id=39194&c=4667446&h=34c63dbaaa7adc7c8a33&_xt=.pdf'
I just added single quotes around the part from media.nl to .pdf.
Make sure a file with the same name doesn't already exist. You don't need --no-check-certificate unless you get a self-signed certificate error.
Upvotes: 1
Reputation: 4718
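Put your URL in quotes. Leaving it unquoted can have strange effects: in your case the shell
interprets each & as a command separator, so wget only receives the URL up to the first &.
With wget -i test.txt the URL is read from the file, so the shell never parses it.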
E.g.
wget "https://www.roofingsuppliesuk.co.uk/core/media/media.nl?id=39194&c=4667446&h=34c63dbaaa7adc7c8a33&_xt=.pdf"
or
wget 'https://www.roofingsuppliesuk.co.uk/core/media/media.nl?id=39194&c=4667446&h=34c63dbaaa7adc7c8a33&_xt=.pdf'
or escape each & with a backslash:
wget https://www.roofingsuppliesuk.co.uk/core/media/media.nl?id=39194\&c=4667446\&h=34c63dbaaa7adc7c8a33\&_xt=.pdf
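To see what the shell actually hands to the command, here is a small sketch that needs no network access. It assumes a POSIX sh; the example.com URL and its query values are placeholders, and printf stands in for wget so nothing is downloaded.

```shell
#!/bin/sh
# Illustration only: example.com and the query values are placeholders,
# and printf stands in for wget so nothing is downloaded.

# Quoted: the shell passes the whole URL through as a single argument.
quoted=$(sh -c 'printf "arg: %s\n" "https://example.com/media.nl?id=1&c=2&_xt=.pdf"')

# Unquoted: each & ends a command and runs it in the background, so
# printf only sees the text before the first &; c=2 and _xt=.pdf are
# then parsed as separate variable assignments.
unquoted=$(sh -c 'printf "arg: %s\n" https://example.com/media.nl?id=1&c=2&_xt=.pdf')

echo "quoted:   $quoted"
echo "unquoted: $unquoted"
```

The unquoted call should report only the part of the URL before the first &, which is consistent with the truncated name media.nl?id=39194.1 in the question's log.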
Upvotes: 2