arberb

Reputation: 960

Check if a URL's mimetype is not a web page

I want to check whether a URL points to something other than a web page. Can I do this in Java? The file could be a RAR, MP3, MP4, MPEG, or anything else; I only need to know that it is not a web page.

Upvotes: 1

Views: 422

Answers (3)

D.Shawley

Reputation: 59633

You can issue an HTTP HEAD request and check the Content-Type response header. Call HttpURLConnection.setRequestMethod("HEAD") before you issue the request, then send it with URLConnection.connect() and read the header with URLConnection.getContentType().
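The steps above can be sketched roughly as follows. This is only a minimal illustration: the class name `HeadContentType`, the helper `fetchContentType`, and the five-second timeouts are assumptions made for the example, not part of the answer.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

public class HeadContentType {

    /**
     * Sends a HEAD request to the given address and returns the raw
     * Content-Type header value, or null if the server did not send one.
     * (Hypothetical helper name, chosen for this sketch.)
     */
    static String fetchContentType(String address) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(address).toURL().openConnection();
        conn.setRequestMethod("HEAD");   // ask for headers only, no body
        conn.setConnectTimeout(5000);    // assumed timeouts, adjust as needed
        conn.setReadTimeout(5000);
        try {
            conn.connect();
            return conn.getContentType();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java HeadContentType <url>");
            return;
        }
        System.out.println(fetchContentType(args[0]));
    }
}
```

The returned value may carry parameters such as a charset (for example `text/html; charset=UTF-8`), so it is usually worth comparing only the part before the semicolon.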

The bonus of using a HEAD request is that the body of the resource is never transmitted. Alternatively, you can issue a GET request and pass the resulting stream to URLConnection.guessContentTypeFromStream(), which inspects the first bytes and tries to guess what the stream represents. I think that it looks for magic numbers or other patterns in the stream.
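A small sketch of the stream-sniffing approach, fed from an in-memory byte array so it runs without a network. The class name `SniffContentType` and the helper `sniff` are made up for illustration. Two details worth knowing: `guessContentTypeFromStream` needs a stream that supports marks, so a plain connection stream should be wrapped in a `BufferedInputStream`, and it returns null when it cannot determine a type, which can happen for formats it does not recognize.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

public class SniffContentType {

    /**
     * Guesses a content type from the leading bytes of the stream.
     * Returns null if the type cannot be determined.
     * (Hypothetical helper name, chosen for this sketch.)
     */
    static String sniff(InputStream in) throws IOException {
        // The JDK method requires mark/reset support on the stream.
        InputStream marked = in.markSupported() ? in : new BufferedInputStream(in);
        return URLConnection.guessContentTypeFromStream(marked);
    }

    public static void main(String[] args) throws IOException {
        // "GIF89a" is the signature at the start of a GIF file.
        byte[] gifHeader = "GIF89a".getBytes(StandardCharsets.US_ASCII);
        System.out.println(sniff(new ByteArrayInputStream(gifHeader)));
    }
}
```

With a real URL you would pass `connection.getInputStream()` to `sniff` instead of the byte array. Because a null result is possible, treating it as "unknown" rather than "not a web page" is probably the safer choice.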

Upvotes: 3

Damien_The_Unbeliever

Reputation: 239824

There's nothing inherent in a URL which will tell you what you will receive when you request it. You have to actually request the resource, and then inspect the content-type header. At that point, it's still not clear what you should do - some content types will (almost) always be handled by the browser, e.g. text/html. Some types should be handled by a browser, e.g. application/xhtml+xml. Some types may be handled by the browser, e.g. application/pdf.

Which, if any, of these you consider to be "webpage" is still not clear - you'll need to decide for yourself.

You can inspect the content-type header once you've requested the resource, using, for example, the HttpURLConnection class.
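One possible way to encode that decision is a small allow-list of media types that you choose to treat as a web page. The set below (text/html and application/xhtml+xml) is only an assumption for illustration, and the class and method names are hypothetical; whether something like application/pdf belongs in it is up to you.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

public class WebPageTypes {

    // Assumption: this list is a policy choice, not a standard definition.
    private static final Set<String> WEB_PAGE_TYPES =
            new HashSet<>(Arrays.asList("text/html", "application/xhtml+xml"));

    /**
     * Returns true if the Content-Type header value names a media type
     * from the allow-list. Parameters such as "; charset=UTF-8" are ignored.
     * A missing header (null) is treated as "not a web page" here.
     */
    static boolean isWebPage(String contentType) {
        if (contentType == null) {
            return false;
        }
        int semicolon = contentType.indexOf(';');
        String mediaType = (semicolon >= 0
                ? contentType.substring(0, semicolon)
                : contentType).trim().toLowerCase(Locale.ROOT);
        return WEB_PAGE_TYPES.contains(mediaType);
    }

    public static void main(String[] args) {
        System.out.println(isWebPage("text/html; charset=UTF-8"));
        System.out.println(isWebPage("audio/mpeg"));
    }
}
```

Keeping the list in one place makes it easy to adjust later if you decide that other types should count as a web page.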

Upvotes: 1

kosa

Reputation: 66677

A Content-Type of text/html indicates a web page.

Upvotes: 0
