user4368224

Python - Download file over HTTP and detect filetype automatically

I want to download a file via HTTP, but all the examples online involve fetching the data and then writing it to a local file. The problem with this is that you need to explicitly set the filetype of the local file.

I want to download a file but I won't know the filetype of what I'm downloading.

This is what I currently have:

urllib.urlretrieve(fetch_url, "output.csv")

But if I download, say, an XML file, it will still be saved as CSV. Is there any way to get Python to detect the type of file returned by a URL like this: http://asassaassa.com/assaas?abc=123

Say the above URL gives me XML; I want Python to detect that.

Upvotes: 1

Views: 2665

Answers (1)

BSL-5

Reputation: 714

You can use python-magic to detect the file type. It can be installed via "pip install python-magic"; it is a wrapper around the libmagic library, which also has to be present on the system.

I assume you are using Python 2.7, since you are calling urllib.urlretrieve. The example is geared to 2.7, but it is easily adapted to Python 3, where the function lives at urllib.request.urlretrieve.

This is a working example:

import mimetypes  # Maps MIME types to file extensions
import os         # For renaming the downloaded file
import urllib     # Python 2.7; in Python 3 use urllib.request instead

import magic      # Uses magic numbers to detect the file type, which is far more reliable than guessing from the name with the built-in mimetypes

mime = magic.Magic(mime=True)
output = "output"  # Your file name without extension
urllib.urlretrieve("https://docs.python.org/3.0/library/mimetypes.html", output)  # This is just an example URL
mime_type = mime.from_file(output)  # Get the MIME type, e.g. "text/html"
extensions = mimetypes.guess_all_extensions(mime_type)  # May be empty for unknown types
if extensions:
    os.rename(output, output + extensions[0])  # Rename the file with the guessed extension
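
If installing python-magic is not an option, a different approach is to trust the Content-Type header that the server sends with the response. The following is only a minimal Python 3 sketch of that idea using the standard library; the helper names extension_for and download_with_extension, the "output" base name, and the ".bin" fallback are assumptions made for illustration, not part of any library. It relies on the server reporting the type correctly, which not every server does.

```python
import mimetypes
import shutil
import urllib.request


def extension_for(content_type, default=".bin"):
    """Map a Content-Type header value to a file extension (hypothetical helper)."""
    # Drop parameters such as "; charset=utf-8" and normalise the case.
    mime_type = content_type.split(";")[0].strip().lower()
    # guess_extension returns None for unknown types, so fall back to the default.
    return mimetypes.guess_extension(mime_type) or default


def download_with_extension(url, basename="output"):
    """Download url to basename + guessed extension and return the path (hypothetical helper)."""
    with urllib.request.urlopen(url) as response:
        content_type = response.headers.get("Content-Type", "")
        path = basename + extension_for(content_type)
        # Stream the response body to disk instead of loading it all into memory.
        with open(path, "wb") as out:
            shutil.copyfileobj(response, out)
    return path
```

Calling download_with_extension(fetch_url) would then save the response under a name such as output.csv or output.html, depending on what the server reports. When the header is missing or wrong, inspecting the file contents with python-magic as shown above remains the more robust choice.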

Upvotes: 3
