Reputation: 28005
Python 2.7:
>>> from mimetypes import guess_extension
>>> guess_extension('text/plain')
'.ksh'
Python 3.5:
>>> from mimetypes import guess_extension
>>> guess_extension('text/plain')
'.c'
How can I get a valid answer?
For me ".txt" would fit.
Even the filetype lib can't handle this :-(
See https://github.com/h2non/filetype.py/issues/30
Upvotes: 4
Views: 945
Reputation: 16624
Although the question mentions mimetypes.guess_extension, it cannot actually be answered with the information in that module. The mapping from MIME type to extension is one-to-many, and the mimetypes database carries no weighting information. Sorting the extensions alphabetically would give a consistent answer, but apparently not the one the OP wants. I considered the following options:
By authority: the IANA database does not have extension information for every type. Only a few types carry it, and it takes hard work to parse.
By popularity: I hope such a ranking exists.
By consensus: the MDN wiki page named "Incomplete list of MIME types" comes closest. It is actively maintained, and it lists only one extension for some well-known MIME types.
I guess the practical solution is to grab the table from the aforementioned MDN wiki page, hard-code those types, and use mimetypes.guess_extension as a fallback.
Note that you should take care to respect the MDN content license.
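A minimal sketch of the hard-coded table with a fallback might look like this. The names PREFERRED_EXTENSIONS and preferred_extension are hypothetical, and the few entries shown are illustrative placeholders rather than a copy of the MDN table:

```python
import mimetypes

# Hypothetical hand-maintained table: one preferred extension per MIME type.
# These entries are illustrative only; fill in your own curated list.
PREFERRED_EXTENSIONS = {
    "text/plain": ".txt",
    "text/html": ".html",
    "image/jpeg": ".jpg",
    "application/json": ".json",
}


def preferred_extension(mime_type):
    """Return the curated extension if there is one, else ask mimetypes."""
    # Drop parameters such as "; charset=utf-8" and normalize case.
    mime_type = mime_type.split(";", 1)[0].strip().lower()
    ext = PREFERRED_EXTENSIONS.get(mime_type)
    if ext is not None:
        return ext
    # Fallback: may be an arbitrary choice among several, or None if unknown.
    return mimetypes.guess_extension(mime_type)
```

With this, preferred_extension('text/plain') gives '.txt' regardless of the Python version, while types missing from the table still get whatever the standard library guesses.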
Upvotes: 4
Reputation: 41198
To get consistent output across Python 2 and 3, use guess_all_extensions and sort the result:
>>> from mimetypes import guess_all_extensions
>>> sorted(guess_all_extensions('text/plain'))
['.asc', '.bat', '.c', '.cc', '.conf', '.cxx', '.el', '.f90', '.h', '.hh', '.hxx', '.ksh', '.log', '.pl', '.pm', '.text', '.txt']
.txt is the last item.
It's kind of odd that these aren't already sorted, since guess_extension just takes the first extension in whatever order they happen to be stored, hence the different outputs you observe.
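As a rough sketch, the sorted list can be wrapped in a small helper so the choice no longer depends on storage order. The function name stable_extension and its preferred parameter are hypothetical, not part of mimetypes:

```python
from mimetypes import guess_all_extensions


def stable_extension(mime_type, preferred=(".txt",)):
    """Pick an extension deterministically from all known candidates."""
    # Sorting removes the dependence on the internal storage order.
    candidates = sorted(guess_all_extensions(mime_type))
    if not candidates:
        return None  # unknown MIME type
    # Honor caller-supplied favorites first (hypothetical parameter).
    for ext in preferred:
        if ext in candidates:
            return ext
    # Otherwise fall back to the alphabetically first candidate.
    return candidates[0]
```

For 'text/plain' this returns '.txt' because it is listed in preferred. With an empty preferred tuple it returns the first item of the sorted list, which is consistent but not necessarily the extension you would expect.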
Upvotes: 6