Hadi

Reputation: 574

How to get a file's extension from its download link in Java?

I want to get the extensions of a few files from their download links.

The download links do not contain the extensions of their files. For example, a link looks like this:

http://yourshot.nationalgeographic.com/u/fQYSUbVfts-T7odkrFJckdiFeHvab0GWOfzhj7tYdC0uglagsDNfNYI4FFesWV5zeSPtcfpyHzKZI7dHjkluwtIYNkXOGmjh43Ktdn0VeBWhQ-9l2kheOPt5N2TM3yPEW4tTrtFFqniatwxxhbqsc78IU2pBaqWwyEVLeQx64zSda2CNGmUpSxyte_tamVoIk3y4zXisQ-vjmMp6n1BAB3nbUVlwWg/

I tried to get the file's extension using myHttpUrlConnection.getContentType(), but the result was not what I wanted.

Some download links return a value like "text/plain", "application/octet-stream", or "multipart/form-data". But I just want a correct and clear type, like rar, mp4, txt, jpeg, mkv, zip, png, apk, or mp3.

Upvotes: 3

Views: 1481

Answers (1)

syntagma

Reputation: 24344

You cannot do that. The getContentType() method simply:

Returns the value of the content-type header field.

which in most cases (though there is no guarantee) is related to the file extension or file type. For example, application/pdf would mean there is a PDF file under that URL.
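To make the "related, but not guaranteed" point concrete, here is a minimal sketch of turning a Content-Type value into an extension. The class name ContentTypeExtensions and its small lookup table are my own assumptions for illustration, not part of any library. Generic values such as application/octet-stream deliberately produce no result, because they say nothing about the actual format.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper, not a library API: maps a Content-Type header value
// to a likely file extension using a small hand-written table.
final class ContentTypeExtensions {

    // Deliberately tiny table (an assumption for this sketch); extend as needed.
    private static final Map<String, String> KNOWN = Map.of(
            "application/pdf", "pdf",
            "image/jpeg", "jpg",
            "image/png", "png",
            "application/zip", "zip",
            "video/mp4", "mp4",
            "audio/mpeg", "mp3",
            "text/plain", "txt",
            "application/vnd.android.package-archive", "apk");

    // Returns the extension for a known MIME type, or empty when the header
    // is missing or too generic (for example application/octet-stream).
    static Optional<String> extensionFor(String contentType) {
        if (contentType == null) {
            return Optional.empty();
        }
        // Drop parameters such as "; charset=UTF-8" and normalise the case.
        String mime = contentType.split(";", 2)[0].trim().toLowerCase(Locale.ROOT);
        return Optional.ofNullable(KNOWN.get(mime));
    }
}
```

It could be called as ContentTypeExtensions.extensionFor(myHttpUrlConnection.getContentType()). The result is only as trustworthy as the header the server chose to send.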

Each of the file types you have listed (rar, mp4, txt, jpeg, mkv, zip, png, apk, mp3) has a different internal structure. To reliably do what you want, you would first have to download the whole file and then determine its type from its contents.

A good example of a library you could use is Apache Tika.
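Just to illustrate what checking the type from the contents means, here is a hand-rolled sketch that compares the leading bytes of a file against a few well-known signatures. This is not how Tika is implemented and covers only a handful of formats. The class name MagicBytes is hypothetical. Formats such as mp4, mkv, or mp3 need more involved checks, and a ZIP signature alone cannot tell a plain zip from an apk, which is part of why a dedicated library is the better choice.

```java
import java.util.Optional;

// Hypothetical helper for illustration: guesses an extension from the
// first bytes of a file by matching a few well-known signatures.
final class MagicBytes {

    static Optional<String> guessExtension(byte[] head) {
        if (startsWith(head, 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A)) {
            return Optional.of("png");
        }
        if (startsWith(head, 0xFF, 0xD8, 0xFF)) {
            return Optional.of("jpeg");
        }
        if (startsWith(head, '%', 'P', 'D', 'F', '-')) {
            return Optional.of("pdf");
        }
        if (startsWith(head, 'R', 'a', 'r', '!', 0x1A, 0x07)) {
            return Optional.of("rar");
        }
        if (startsWith(head, 'P', 'K', 0x03, 0x04)) {
            // Same signature for apk, jar, docx and other ZIP-based formats.
            return Optional.of("zip");
        }
        return Optional.empty();
    }

    // True when data begins with the given unsigned byte values.
    private static boolean startsWith(byte[] data, int... signature) {
        if (data == null || data.length < signature.length) {
            return false;
        }
        for (int i = 0; i < signature.length; i++) {
            if ((data[i] & 0xFF) != signature[i]) {
                return false;
            }
        }
        return true;
    }
}
```

On Java 11 or later, the leading bytes could be read with myHttpUrlConnection.getInputStream().readNBytes(8) and passed to guessExtension. For signature-based formats like these, a short prefix is enough, while telling ZIP-based formats apart requires inspecting more of the file.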

Upvotes: 3
