Reputation: 13145

Establishing the name associated with an HTML link

I wish to download the files associated with a set of links in an HTML document.

A link might appear like this:

<a href="d?kjdfer87">

But when I click on it in my browser, I get the following file downloaded:

file2.txt

The following will download the file via Python:

import urllib.request

opener = urllib.request.build_opener()
# "unknown.txt" stands in for the full URL of the link
r = opener.open("unknown.txt")
r.read()

but how do I establish that the file was actually called file2.txt?

Upvotes: 0

Answers (4)

Reputation: 13145

It's actually this simple:

r.info().get_filename()
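
For context, here is a minimal sketch of how this might be used, assuming Python 3's urllib.request. The helper names filename_from_headers and download_name and the example.com URL are made up for illustration. get_filename() returns None when the server sends no filename, so the sketch falls back to the last path segment of the URL.

```python
import http.client
import io
import posixpath
import urllib.parse
import urllib.request


def filename_from_headers(headers, url):
    """Return the server-suggested filename, else the last path segment of url."""
    name = headers.get_filename()  # None when no filename parameter is present
    if name:
        # Drop any directory components a server might include.
        return posixpath.basename(name.replace("\\", "/"))
    path = urllib.parse.urlsplit(url).path
    return posixpath.basename(path) or None


def download_name(url):
    """Open url (hypothetical, needs network access) and report the suggested name."""
    with urllib.request.urlopen(url) as r:
        return filename_from_headers(r.info(), r.geturl())


# Offline check using hand-written headers instead of a live response.
raw = b'Content-Disposition: attachment; filename="file2.txt"\r\n\r\n'
headers = http.client.parse_headers(io.BytesIO(raw))
print(filename_from_headers(headers, "http://example.com/d?kjdfer87"))
```

The fallback is a design choice, not something the server guarantees: for a link like d?kjdfer87 it would only yield "d", which is why the header is checked first.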

Upvotes: 1

Reputation: 18780

The Content-Disposition header in the HTTP response is what specifies that the response should be downloaded with a specific filename.

Upvotes: 0

Reputation: 184345

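Check the Content-Disposition header on the response. It can suggest a filename. I believe this would be in r.info()['Content-Disposition'] (in Python 3, r.info() returns a message object that is indexed directly by header name, and the value is None if the header is absent).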
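
A rough sketch of pulling the filename out of that header value, assuming Python 3 and using the standard library's email.message.Message to do the parameter parsing. The function name parse_content_disposition is hypothetical, and the sample header values are invented for illustration.

```python
from email.message import Message


def parse_content_disposition(value):
    """Split a Content-Disposition value into (disposition, filename)."""
    msg = Message()
    msg["Content-Disposition"] = value
    disposition = msg.get_content_disposition()  # e.g. "attachment" or "inline"
    filename = msg.get_filename()  # None if there is no filename parameter
    return disposition, filename


print(parse_content_disposition('attachment; filename="file2.txt"'))
```

With a live response, the value passed in would be r.info()['Content-Disposition'], after checking that it is not None.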

Upvotes: 2

Reputation: 599876

I'm not sure why you think you need the name. You should request it in exactly the same way as the browser does, i.e. with the value in the href.
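
If the href is relative, as in the question, it probably has to be resolved against the URL of the page before it can be requested. A minimal sketch, assuming Python 3; the LinkCollector class, the extract_links helper, and the example.com page URL are all made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, resolved against the page URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Relative hrefs such as "d?kjdfer87" need the page URL as a base.
                self.links.append(urljoin(self.base_url, href))


def extract_links(html, base_url):
    """Return absolute URLs for all links found in the html string."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return parser.links


print(extract_links('<a href="d?kjdfer87">', "http://example.com/files/index.html"))
```

Each resulting URL can then be passed to opener.open() as in the question.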

Upvotes: 0