Reputation:
I want to download a file via HTTP, but all the examples online involve fetching the data and then writing it to a local file. The problem with this is that you need to explicitly set the file type (extension) of the local file.
I want to download a file but I won't know the filetype of what I'm downloading.
This is what I currently have:
urllib.urlretrieve(fetch_url, "output.csv")
But if I download, say, an XML file, it will still be saved as CSV. Is there any way to get Python to detect the type of file I get sent from a URL like: http://asassaassa.com/assaas?abc=123
Say the above URL gives me an XML file; I want Python to detect that.
Upvotes: 1
Views: 2665
Reputation: 714
You can use python-magic to detect file type. It can be installed via "pip install python-magic".
I assume you are using Python 2.7, since you are calling urllib.urlretrieve (in Python 3 it lives at urllib.request.urlretrieve). The example is geared to 2.7, but it is easily adapted.
This is a working example:
import mimetypes  # Maps MIME types to file extensions
import magic  # Uses magic numbers in the file content to detect the type, which is more reliable than guessing from the name
import urllib  # Your library
import os  # For renaming your file
mime = magic.Magic(mime=True)
output = "output"  # Your file name without extension
urllib.urlretrieve("https://docs.python.org/3.0/library/mimetypes.html", output)  # This is just an example URL
mime_type = mime.from_file(output)  # Get the MIME type of the downloaded file
ext = mimetypes.guess_extension(mime_type) or ""  # Guess extension; empty string if the type is unknown
os.rename(output, output + ext)  # Rename file
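If you would rather not install anything, a different approach is to trust the Content-Type header the server sends back instead of inspecting the file content. This is only a rough Python 3 sketch using the standard library: the function names, the "output" base name and the ".bin" fallback are my own choices, and it assumes the server reports the type correctly, which is not guaranteed (python-magic is the safer option when the header is missing or wrong).

```python
import mimetypes
import urllib.request


def extension_for(content_type, default=".bin"):
    """Map a Content-Type header value to a file extension (hypothetical helper)."""
    # Drop parameters such as "; charset=utf-8" and normalise the case.
    mime_type = content_type.split(";", 1)[0].strip().lower()
    # guess_extension returns None for unknown types, so fall back to a default.
    return mimetypes.guess_extension(mime_type) or default


def download_with_extension(url, basename="output"):
    """Download url and save it as basename plus the guessed extension."""
    with urllib.request.urlopen(url) as response:
        # The header may be absent, in which case the default extension is used.
        content_type = response.headers.get("Content-Type", "")
        path = basename + extension_for(content_type)
        with open(path, "wb") as out:
            out.write(response.read())
    return path
```

You would call it as download_with_extension(fetch_url) and get back the name of the file that was written.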
Upvotes: 3