Reputation: 1970
I want to classify file types based on their extensions in Python. Before writing it myself, I wanted to check whether there is a Python package that can be used for this purpose. By file type I mean classifying a file as e.g. doc, ppt, pdf, tar, txt, iso, etc. Ideally it would take the file name as input and return its type. I am running on Linux.
Upvotes: 1
Views: 758
Reputation: 9568
You should look into a document metadata parser. I have used Apache Tika, a Java library, in some of my projects. See the question "Python-based document metadata parser?" for how to use it from Python.
Upvotes: 2
Reputation: 7333
On Linux you can use the 'file' utility, which determines a file's type from its contents rather than its extension. You can call it from your scripts too:
import subprocess
# call() lets 'file' print its result to stdout and returns the exit code
subprocess.call(['file', 'yourfile'])
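If you need the result as a value in your script, or only want to go by the file name as the question asks, something like the following might work. This is a minimal sketch, not a tested recipe: the function names `classify_by_extension` and `classify_by_content` are made up for illustration, the first relies on the standard-library `mimetypes` module, and the second assumes the 'file' utility is installed and that the path points to an existing file.

```python
import mimetypes
import os
import subprocess


def classify_by_extension(filename):
    """Return (extension, MIME type) using only the file name.

    The file does not have to exist; nothing is read from disk.
    The MIME type is None when the extension is not recognised.
    """
    # splitext keeps the leading dot, so strip it and normalise case
    ext = os.path.splitext(filename)[1].lstrip(".").lower()
    mime, _encoding = mimetypes.guess_type(filename)
    return ext, mime


def classify_by_content(path):
    """Ask the 'file' utility for the MIME type of an existing file.

    Assumes 'file' is on the PATH; raises CalledProcessError on failure.
    """
    result = subprocess.run(
        ["file", "--brief", "--mime-type", path],
        capture_output=True,  # collect stdout instead of printing it
        text=True,            # decode bytes to str
        check=True,           # raise if 'file' exits with an error
    )
    return result.stdout.strip()
```

For example, `classify_by_extension("report.pdf")` gives the extension and a MIME type guessed from the name alone, while `classify_by_content` inspects the actual bytes, so it is not fooled by a renamed file. Note that `os.path.splitext` only returns the last suffix, so a name like `archive.tar.gz` yields `gz` as the extension.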
Upvotes: 1