Reputation: 37
I want a script that takes a file name and checks whether it is a file. A file name ends with .txt, .exe, etc. Is there any library or module in Python that includes ALL file formats? If there isn't, how can I verify that a given input (like hey.txt, what.exe, etc.) is a file? P.S. I'm checking files on a website, not operating system files (for example: "https://www.magshimim.net/App_Themes/En/images/powered_by_priza_heb.gif"). Thanks to all the helpers :)
Upvotes: 0
Views: 179
Reputation: 4236
I suggest:
import os.path # Use any path module (ntpath, posixpath, ...) that uses "." as an extension separator instead, to be sure (if you want)
inputname = "hey.txt" # example name from the question
filename, ext = os.path.splitext(inputname)
# splitext keeps a leading dot in the name part, so '.bashrc' gives ('.bashrc', '')
# If filename and ext are both non-empty, then it is a filename like 'something.txt'
# If ext is empty and filename starts with ".", then it is something like '.bashrc' or '.ds_store'
# If ext is empty otherwise, then the name doesn't have an extension
# So:
if filename and ext:
    print("File %s with extension %s" % (filename, ext))
elif filename.startswith("."):
    print("File %s with no extension!" % filename)
else:
    print("%s is not a file by 'must have an extension' rule!" % filename)
You can also achieve the check with something like:
c = inputname.count(".")
if c != 0 and not inputname.endswith(".") and not (inputname.startswith(".") and c == 1):
    print("%s is a file because it has an extension!" % inputname)
else:
    print("%s is not a file, no extension!" % inputname)
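Since the names in the question come from URLs, the same extension test can be applied to the path part of the URL only, so that a query string or the host name does not confuse it. This is a minimal Python 3 sketch, not a definitive implementation; the helper name url_extension is made up for illustration.

```python
import posixpath
from urllib.parse import urlparse


def url_extension(url):
    """Return the extension of the last path segment of a URL, or '' if none."""
    path = urlparse(url).path        # drops the scheme, host, query string and fragment
    name = posixpath.basename(path)  # last path segment, e.g. 'magshimim_logo.png'
    _root, ext = posixpath.splitext(name)
    return ext


print(url_extension("https://www.magshimim.net/App_Themes/En/images/powered_by_priza_heb.gif"))  # .gif
print(url_extension("https://www.magshimim.net/images/"))  # prints an empty line
```

A plain name such as hey.txt also works, because urlparse treats it as a path.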
If you really have to check against existing formats, then yes, use mimetypes.
Or search around: I saw a fairly extensive list of formats somewhere, packaged as a PHP library. You could take that and convert it to Python; a few find-and-replaces would do it.
Upvotes: 0
Reputation: 191
If the files are located on a web server, you can use the Content-Type header to get the type of the file.
import urllib2
urls = ['https://www.magshimim.net/App_Themes/En/images/powered_by_priza_heb.gif',
        'https://www.magshimim.net/images/magshimim_logo.png']
for url in urls:
    response = urllib2.urlopen(url)
    print url
    print response.headers.getheader('Content-type') # Content Type
    print response.headers.getheader('Content-Length') # Size
    print
Output should be:
https://www.magshimim.net/App_Themes/En/images/powered_by_priza_heb.gif
image/gif
1325
https://www.magshimim.net/images/magshimim_logo.png
image/png
8314
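On Python 3, urllib2 no longer exists; its functionality lives in urllib.request. Below is a rough sketch of the same check for Python 3. It assumes the server answers HEAD requests (not every server does), and the helper names head_request and content_info are made up for illustration.

```python
import urllib.request


def head_request(url):
    """Build a HEAD request so that only the headers are fetched, not the body."""
    return urllib.request.Request(url, method="HEAD")


def content_info(url, timeout=10):
    """Return (Content-Type, Content-Length) as reported by the server.

    Either value is None if the server does not send that header.
    """
    with urllib.request.urlopen(head_request(url), timeout=timeout) as response:
        headers = response.headers
        return headers.get("Content-Type"), headers.get("Content-Length")
```

Calling content_info on one of the URLs above should give the same type and size pair as the Python 2 loop, provided the server still serves those files.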
Upvotes: 2
Reputation: 19717
There is no such library because there is an unlimited number of file formats.
I can create my own .something, and you can too; the file will still be a proper file.
Instead, you have to use os.path.isfile().
As @zero323 pointed out, and according to your edit, you should use the mimetypes library.
Then, use .guess_type(), which returns a (type, encoding) tuple whose type is None if the file type cannot be guessed.
See the full list of MIME types here.
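To illustrate the guess_type() suggestion, here is a small Python 3 sketch. The wrapper name guess_mime is made up; mimetypes.guess_type itself is the standard library call, and it only looks at the name, so nothing is downloaded. The exact type reported for less common extensions can depend on the system's MIME tables.

```python
import mimetypes


def guess_mime(name):
    """Return the MIME type guessed from a file name or URL, or None if unknown."""
    mime_type, _encoding = mimetypes.guess_type(name)
    return mime_type


for name in ["hey.txt",
             "https://www.magshimim.net/App_Themes/En/images/powered_by_priza_heb.gif",
             "name_without_extension"]:
    print("%s -> %s" % (name, guess_mime(name)))
```

A result of None means "no known type", not "not a file", which matches the point above that anyone can invent a new extension.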
Upvotes: 2
Reputation: 356
The best approach would be to use a regular expression, since your script is checking whether the given name looks like a file. If you want to check whether a particular file exists, it would be better to use os.path.isfile(path). If you are comfortable with regular expressions, try writing one yourself; otherwise let me know and I will write it for you. Your feedback will be highly appreciated, thank you.
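One possible regular expression, as a rough sketch only: the pattern and the name has_extension are assumptions, not a standard. It accepts a name whose last dot follows an ordinary character and is followed by letters or digits up to the end. It is meant for a bare file name; applied to a whole URL it would also match the host, for example the ".net" in "www.magshimim.net".

```python
import re

# Assumed rule: some character other than '.', '/' or '\' before the last dot,
# then one or more letters or digits up to the end of the name.
EXTENSION_RE = re.compile(r"[^./\\]\.[A-Za-z0-9]+$")


def has_extension(name):
    """Return True if the bare file name ends with something like '.txt'."""
    return EXTENSION_RE.search(name) is not None


for name in ["hey.txt", "what.exe", ".bashrc", "README", "name."]:
    print("%s -> %s" % (name, has_extension(name)))
```

With this rule, hey.txt and what.exe are accepted, while .bashrc, README and a name ending in a dot are rejected.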
Upvotes: 0