Reputation: 5606
I'm attempting to download files from a website that uses a CDN for distribution. The URLs on the download page all end with file.pdf, but clicking a link in a browser downloads a file with a descriptive name (e.g. 'invoice1234.pdf'). Parsing the URL to get the file name therefore names every file file.pdf. I would like to use the same file name the browser uses. My code looks something like this:
filename = File.basename(mov_download_link.href)
agent.pluggable_parser.default = Mechanize::Download
agent.get(mov_download_link.href).save("#{path}/#{filename}")
agent.pluggable_parser.default = Mechanize::File
Any ideas would be appreciated!
Upvotes: 2
Views: 1306
Reputation: 54992
That filename is probably in the Content-Disposition response header, which looks something like this:
{'content-disposition' => 'attachment; filename="invoice1234.pdf"'}
If so:
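f = agent.get(mov_download_link.href)
# Take the quoted name from the header; fall back to the URL's basename if the header is absent.
disposition = f.header['content-disposition'].to_s
filename = disposition[/"(.*)"/, 1] || File.basename(mov_download_link.href)
f.save("#{path}/#{filename}")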
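If the server sometimes sends the name unquoted, or omits the header entirely, a slightly more defensive parse may help. The following is only a sketch: filename_from_content_disposition is a hypothetical helper name, the "file.pdf" fallback is an assumption taken from the question, and the RFC 5987 form (filename*=UTF-8''...) is deliberately not handled.

```ruby
# Hypothetical helper (not part of Mechanize): extract a file name from a
# Content-Disposition header value, with a fallback when none is present.
def filename_from_content_disposition(value, fallback = "file.pdf")
  return fallback if value.nil?

  # Accept either filename="quoted name" or filename=bare-token.
  match = value.match(/filename\s*=\s*(?:"([^"]*)"|([^;\s]+))/i)
  name = match && (match[1] || match[2])
  return fallback if name.nil? || name.empty?

  # Drop any directory components so the header cannot write outside the target directory.
  File.basename(name)
end

# Assumed usage with the objects from the answer above:
#   f = agent.get(mov_download_link.href)
#   name = filename_from_content_disposition(f.header['content-disposition'])
#   f.save(File.join(path, name))
```

It may also be worth checking f.filename before parsing by hand, since Mechanize attempts to derive that value from the Content-Disposition header itself.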
Upvotes: 2