PSC

Reputation: 103

How can I tell if visiting a URL would download a file of a certain mimetype?

I am building an application that tells me whether visiting a URL would make a user download a file of a certain mimetype. My question is: what information (such as header fields) can be used to achieve this?

I was thinking about sending a HEAD request and looking at the Content-Disposition and Content-Type header fields. But an attacker might simply lie in these fields, and because of MIME sniffing my browser would still save the file.

Is there a way to get this information without downloading the file (which would cause unwanted traffic)?

EDIT: I want to develop an application that takes a URL as input. The output should be three things:
1: Does visiting the URL make browsers save ("download") a file delivered by the web server?
If 1 is true:
2: What is the mimetype of this file?
3: What is the filename of this file?

Example:
The URL https://foo.bar/game.exe, visited with a browser, saves the file game.exe.
How could I tell (without causing huge traffic by downloading the file) that the URL will 1: make me download a file, 2: of type application/octet-stream, 3: named game.exe?

I already know how to make a HEAD request. But can I really trust the Content-Disposition and Content-Type header fields? I have observed responses that did not contain a Content-Disposition field, and my browser still saved the file. This would cause my application to think the URL is clear when it isn't.

Upvotes: 0

Views: 634

Answers (2)

LvB

Reputation: 121

Browsers generally do not guess the MIME type if a type is present in the Content-Type header (see MDN: MIME types).

So if that header and/or the Content-Disposition header is present, you can largely rely on the browser not guessing.
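As a rough sketch of the header-based check (not a guarantee of what any particular browser does), the decision could look like the following in Python. The function name `classify` and the `INLINE_TYPES` set are assumptions made up for this example; real browsers each keep their own list of types they render inline.

```python
from email.message import Message
from urllib.parse import urlsplit
import posixpath

# Hypothetical list of types a browser usually displays instead of saving.
# This is an assumption for illustration; actual browsers differ.
INLINE_TYPES = {
    "text/html", "text/plain", "image/png", "image/jpeg",
    "image/gif", "application/pdf",
}


def classify(url, headers):
    """Return (download, mimetype, filename) guessed from response headers.

    download is True/False, or None when no Content-Type was declared
    and the browser would have to sniff the body.
    """
    # Reuse the stdlib header parser instead of splitting strings by hand.
    msg = Message()
    for name, value in headers.items():
        msg[name] = value

    disposition = msg.get_content_disposition()  # "attachment", "inline" or None
    mime = msg.get_content_type() if "Content-Type" in msg else None

    if disposition == "attachment":
        download = True
    elif mime is None:
        download = None  # undeclared type: outcome depends on sniffing
    else:
        download = mime not in INLINE_TYPES

    filename = None
    if download:
        # Prefer the filename parameter, fall back to the last URL path segment.
        filename = msg.get_filename() or posixpath.basename(urlsplit(url).path) or None
    return download, mime, filename
```

For the example from the question, `classify("https://foo.bar/game.exe", {"Content-Type": "application/octet-stream"})` yields a download of type application/octet-stream named game.exe, provided the server is telling the truth in its headers.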

Now, to detect what you are actually getting, the best way is to request only the start of the file (the first few bytes) and decode the magic value from it (i.e. the *NIX way of determining what a file is). This is more reliable and less risky than depending on the file extension.
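A minimal sketch of the magic-value approach, assuming the server honours a `Range` header (it may ignore it and send the whole body, so the code reads only the first bytes and then closes the connection). The `MAGIC` table and the function names are hypothetical and cover just a handful of formats.

```python
import urllib.request
from typing import Optional

# Small illustrative table of magic prefixes; far from complete.
MAGIC = [
    (b"MZ", "application/vnd.microsoft.portable-executable"),
    (b"%PDF-", "application/pdf"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"PK\x03\x04", "application/zip"),
]


def sniff(prefix: bytes) -> Optional[str]:
    """Return a MIME type if the bytes start with a known magic value."""
    for magic, mime in MAGIC:
        if prefix.startswith(magic):
            return mime
    return None


def build_range_request(url: str, n: int = 16) -> urllib.request.Request:
    """Build a GET request that asks for only the first n bytes."""
    return urllib.request.Request(url, headers={"Range": f"bytes=0-{n - 1}"})


def fetch_prefix(url: str, n: int = 16, timeout: float = 10) -> bytes:
    """Download at most n bytes of the body (performs a network request)."""
    with urllib.request.urlopen(build_range_request(url, n), timeout=timeout) as resp:
        # read(n) stops after n bytes even if the server ignored the Range header.
        return resp.read(n)
```

Something like `sniff(fetch_prefix(url))` would then give a type based on the content itself rather than on what the server claims.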

But if you need a foolproof method to determine whether a file will be downloaded, there isn't one that I know of.

Upvotes: 2

mti2935

Reputation: 12017

This can be done using curl, with the -I option (to fetch headers only), like so:

curl -I https://www.irs.gov/pub/irs-pdf/f1040.pdf
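Since the goal is an application rather than a one-off check, roughly the same request can be made from Python's standard library. This is only a sketch; `head_request` and `fetch_headers` are names invented here, and calling `fetch_headers` performs a real network request.

```python
import urllib.request


def head_request(url):
    """Build a HEAD request, so the server sends headers but no body."""
    return urllib.request.Request(url, method="HEAD")


def fetch_headers(url, timeout=10):
    """Send the HEAD request and return (status, headers).

    The returned headers object supports case-insensitive .get().
    urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    """
    with urllib.request.urlopen(head_request(url), timeout=timeout) as resp:
        return resp.status, resp.headers


# Example usage (hits the network, so it is left commented out):
# status, headers = fetch_headers("https://www.irs.gov/pub/irs-pdf/f1040.pdf")
# print(status, headers.get("Content-Type"), headers.get("Content-Disposition"))
```

As with curl, this only reports what the server declares in its headers; it does not check whether the body matches.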

Upvotes: 0
