Peter

Reputation: 13485

Python 2/3: Get name and extension of file at URL

When I download this file: https://drive.google.com/uc?export=download&id=0B4IfiNtPKeSATWZXWjEyd1FsRG8

Chrome knows that it is named testzip2.zip and downloads it to the download folder with this name.

How can I get this name in Python (in a way that works in both Python 2.7 and 3.X)?

My previous approach:

response = urlopen(url)
header = response.headers['content-disposition']
original_file_name = next(x for x in header.split(';') if x.startswith('filename')).split('=')[-1].lstrip('"\'').rstrip('"\'')

This does not work reliably: it fails occasionally, and seemingly at random, with KeyError: 'content-disposition' or AttributeError: 'NoneType' object has no attribute 'split'.

Upvotes: 0

Views: 505

Answers (1)

Oluwafemi Sule

Reputation: 38952

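You can read the header with .get(), which returns None instead of raising KeyError when the header is missing, and extract the filename with a regular expression: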

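import re
...

# .get() returns None when the header is absent; fall back to an empty string
content_disposition = response.headers.get('Content-Disposition') or ''
# Accept quoted or unquoted values, including names with hyphens or spaces
match = re.search(r'filename="?([^";]+)"?', content_disposition)
filename = match.group(1) if match else None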

In Python 3, however, the HTTPMessage object has a handy get_filename() method, which returns None when the header carries no filename:

filename = response.headers.get_filename()  # python3
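
For code that has to run on both Python 2.7 and 3.x, one option is to parse the header value with email.message.Message, which provides get_filename() in both versions. The following is only a sketch: guess_filename is a hypothetical helper name, and the fallback to the last segment of the URL path is an assumption about what is acceptable when the server sends no filename.

```python
import posixpath
from email.message import Message

try:
    from urllib.parse import urlsplit, unquote  # Python 3
except ImportError:
    from urlparse import urlsplit  # Python 2
    from urllib import unquote


def guess_filename(content_disposition, url):
    """Return the filename from a Content-Disposition value, else from the URL path.

    Hypothetical helper: content_disposition may be None or an empty string.
    """
    if content_disposition:
        # Let the email package parse the header parameters for us.
        msg = Message()
        msg['Content-Disposition'] = content_disposition
        name = msg.get_filename()
        if name:
            # Drop any directory part a server might have included.
            return posixpath.basename(name)
    # Assumed fallback: last segment of the URL path, or None if there is none.
    path = urlsplit(url).path
    return posixpath.basename(unquote(path)) or None
```

It could be called as guess_filename(response.headers.get('Content-Disposition'), url). Note that for URLs like the Google Drive one in the question, the path fallback is of little use (the path is just /uc), so a missing header still has to be handled by the caller.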

Upvotes: 1
