guettli

Reputation: 28005

content-type text/plain has file extension .ksh?

Python 2.7:

>>> from mimetypes import guess_extension
>>> guess_extension('text/plain')
'.ksh'

Python 3.5:

>>> from mimetypes import guess_extension
>>> guess_extension('text/plain')
'.c'

How can I get a valid answer?

For me ".txt" would fit.

Even the filetype lib can't handle this :-(

See https://github.com/h2non/filetype.py/issues/30

Upvotes: 4

Views: 945

Answers (2)

georgexsh

Reputation: 16624

Although the question mentions mimetypes.guess_extension, it cannot really be answered with the information in that module. The mapping from MIME type to extension is one-to-many, and the mimetypes database carries no weighting information. Sorting the extensions alphabetically would give a consistent answer, but apparently not the one the OP wants. I considered the following options:

  • By authority: the IANA database does not have extension information for every type; only a few types include it, and it takes hard work to parse.

  • By popularity: I hope such a ranking exists.

  • By consensus: the MDN wiki page named "Incomplete list of MIME types" comes closest. It is actively maintained, and it lists only one extension for some well-known MIME types.

I guess the practical solution is to grab the table from the aforementioned MDN wiki page, hard-code those types, and use mimetypes.guess_extension as a fallback.

Note that you should respect the MDN content license.
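
A minimal sketch of that approach, assuming a small hand-maintained table. The name PREFERRED_EXTENSIONS, the helper preferred_extension, and the handful of entries are hypothetical illustrations, not a copy of the MDN list:

```python
from mimetypes import guess_extension

# Hypothetical hand-maintained table: one preferred extension per MIME type.
# The entries are illustrative only; fill in the types you care about.
PREFERRED_EXTENSIONS = {
    "text/plain": ".txt",
    "text/html": ".html",
    "image/jpeg": ".jpg",
    "application/json": ".json",
}


def preferred_extension(mime_type):
    """Return the hard-coded extension, else whatever mimetypes guesses."""
    # Drop parameters such as "; charset=utf-8" and normalise the case.
    key = mime_type.split(";", 1)[0].strip().lower()
    ext = PREFERRED_EXTENSIONS.get(key)
    if ext is not None:
        return ext
    # Fallback: may be arbitrary for types with many extensions, or None.
    return guess_extension(key)
```

With this table, preferred_extension('text/plain') returns '.txt' regardless of the Python version, and types missing from the table still go through guess_extension.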

Upvotes: 4

Chris_Rands

Reputation: 41198

To get consistent output on both Python 2 and 3, you can use guess_all_extensions and sort the result:

>>> from mimetypes import guess_all_extensions
>>> sorted(guess_all_extensions('text/plain'))
['.asc', '.bat', '.c', '.cc', '.conf', '.cxx', '.el', '.f90', '.h', '.hh', '.hxx', '.ksh', '.log', '.pl', '.pm', '.text', '.txt']

.txt is the last item.

It's kind of odd that these aren't already sorted: guess_extension just returns the first extension from an arbitrarily ordered list, hence the different outputs you observe.
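
The sorting idea can be wrapped in a small helper. This is only a sketch: stable_extension and its preferred parameter are made-up names, and it assumes you are happy to list the extensions you favour yourself:

```python
from mimetypes import guess_all_extensions


def stable_extension(mime_type, preferred=(".txt",)):
    """Pick an extension deterministically from guess_all_extensions."""
    # Sorting removes the dependence on the arbitrary internal order.
    candidates = sorted(guess_all_extensions(mime_type))
    # Favour any extension the caller explicitly prefers.
    for ext in preferred:
        if ext in candidates:
            return ext
    # Otherwise take the alphabetically first one, or None if unknown.
    return candidates[0] if candidates else None
```

For 'text/plain' this returns '.txt' as long as '.txt' is among the known extensions. For other types it falls back to the alphabetically first candidate, which is at least the same on every run.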

Upvotes: 6
