Baz

Reputation: 13145

Establishing the name associated with an HTML link

I wish to download the files associated with a set of links in an HTML document.

A link might appear like this:

<a href="d?kjdfer87">

But when I click on it in my browser, I get the following file downloaded:

file2.txt

The following will download the file via python:

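import urllib.request

opener = urllib.request.build_opener()
# "unknown.txt" is a placeholder; open() needs the full URL that the href resolves to
r = opener.open("unknown.txt")
r.read()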

But how do I establish that the file is actually called file2.txt?

Upvotes: 0

Views: 40

Answers (4)

Baz

Reputation: 13145

It's actually this simple:

r.info().get_filename()
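
For context, here is a minimal sketch of that call inside a small helper. The name fetch_with_name is made up for illustration, and the helper assumes you pass it a full URL. get_filename() returns None when the server does not suggest a name.

```python
import urllib.request


def fetch_with_name(url):
    """Return (suggested_filename, body_bytes) for url.

    fetch_with_name is a hypothetical helper, not part of urllib.
    suggested_filename is None when the response suggests no filename.
    """
    opener = urllib.request.build_opener()
    with opener.open(url) as r:
        # r.info() returns the response headers as an email.message.Message
        # subclass; get_filename() reads the filename parameter of the
        # Content-Disposition header.
        return r.info().get_filename(), r.read()
```

You would call it as name, data = fetch_with_name(full_url) and then write data to a file named name, after checking that name is not None.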

Upvotes: 1

wrschneider

Reputation: 18780

The Content-Disposition header in the HTTP response is what specifies that the response should be downloaded with a specific filename.

See: How to encode the filename parameter of Content-Disposition header in HTTP?
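
To illustrate what that header looks like, here is a sketch that parses a made-up header block offline. The header text is an assumption chosen to match the file2.txt example in the question, not something captured from a real server.

```python
from email import message_from_string

# Hypothetical response headers, written by hand for illustration.
raw_headers = (
    "Content-Type: text/plain\n"
    'Content-Disposition: attachment; filename="file2.txt"\n'
    "\n"
)

headers = message_from_string(raw_headers)

# The raw header value, as the server would send it.
print(headers["Content-Disposition"])

# The filename parameter extracted from that header.
print(headers.get_filename())
```

The headers object returned by r.info() on a urllib.request response supports the same two lookups.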

Upvotes: 0

kindall

Reputation: 184345

Check the Content-Disposition header on the response. It can suggest a filename. I believe this would be in r.info()['Content-Disposition'].
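
Since the header is only a suggestion and may be missing, you probably want a fallback. A rough sketch, where choose_filename and the "download" default are names I made up, and falling back to the last URL path segment is just one possible policy:

```python
import posixpath
from email.message import Message
from urllib.parse import unquote, urlsplit


def choose_filename(headers, url, default="download"):
    """Pick a local filename for a response (hypothetical helper).

    headers is the object returned by r.info(); url is the requested URL.
    """
    # Prefer the name suggested by Content-Disposition, if any.
    name = headers.get_filename()
    if not name:
        # Fall back to the last segment of the URL path (query string ignored).
        name = unquote(posixpath.basename(urlsplit(url).path))
    # Keep only the final path component so that a name such as
    # "../../etc/passwd" cannot point outside the download directory.
    name = name.replace("\\", "/").rsplit("/", 1)[-1]
    return name or default
```

With an r from opener.open(url), this would be called as choose_filename(r.info(), url).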

Upvotes: 2

Daniel Roseman

Reputation: 599876

I'm not sure why you think you need the name. You should request it in exactly the same way as the browser does, i.e. with the value in the href.
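
A browser resolves a relative href against the URL of the page it came from, so you need to do the same before opening it. A minimal sketch, where LinkCollector is a made-up class and the example.com page URL is an assumption, since the question does not say where the document lives:

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the absolute URL of every <a href="..."> in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for key, value in attrs:
            if key == "href" and value:
                # Resolve the href against the page URL, as a browser would.
                self.links.append(urljoin(self.base_url, value))


# Hypothetical page URL; the href is the one from the question.
collector = LinkCollector("http://example.com/files/index.html")
collector.feed('<a href="d?kjdfer87">file</a>')
print(collector.links)
```

Each entry in collector.links is a full URL that can be passed to opener.open().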

Upvotes: 0
