Reputation: 79635
I'm writing a small application that saves file paths to a database (using Django). I assumed file paths are UTF-8 encoded, but I ran into the following file name: C:\FXG™.nfo, which is apparently not encoded in UTF-8.
When I do filepath.decode('utf-8') I get the following error:
UnicodeDecodeError: 'utf8' codec can't decode byte 0x99 in position 30: invalid start byte
(I trimmed the file name, so the position is wrong here).
How can I find out how file paths are encoded, so that decoding works for every file name?
Upvotes: 0
Views: 2361
Reputation: 11781
Use sys.getfilesystemencoding().
That should let you decode every path that is validly encoded in the filesystem's encoding.
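As a rough sketch of that idea, the helper below decodes a byte-string path with the filesystem encoding unless another encoding is passed in. The name decode_path and its encoding parameter are hypothetical, made up for illustration. The example call assumes the bytes came from a Windows-1252 (cp1252) system, where byte 0x99 is the trademark sign; the question does not say which code page was actually in use.

```python
import sys


def decode_path(raw_path, encoding=None):
    """Decode a byte-string path to text (hypothetical helper).

    Falls back to the interpreter's filesystem encoding when no
    explicit encoding is given.
    """
    if encoding is None:
        encoding = sys.getfilesystemencoding()
    return raw_path.decode(encoding)


# Assumption: the path bytes were produced under cp1252, where the
# byte 0x99 maps to the trademark sign (U+2122).
raw = b"C:\\FXG\x99.nfo"
text = decode_path(raw, encoding="cp1252")
```

On the machine that produced the path you would normally omit the encoding argument and let sys.getfilesystemencoding() decide; passing it explicitly is only useful when the bytes were collected on a different system.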
However, there can always be illegally encoded files or folders, so you have to think about how to deal with those within your application.
Some apps ignore such files; others keep the name as a binary blob.
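One possible way to handle both cases is to try decoding and keep the raw bytes when that fails, so the caller can decide whether to skip the file or store the blob. This is only a sketch: decode_or_keep and its return convention are hypothetical names chosen here, not part of Django or the standard library.

```python
import sys


def decode_or_keep(raw_path, encoding=None):
    """Return (text, None) on success, or (None, raw_path) on failure.

    Hypothetical helper: the caller can ignore entries whose text is
    None, or store the untouched bytes as a binary blob instead.
    """
    if encoding is None:
        encoding = sys.getfilesystemencoding()
    try:
        return raw_path.decode(encoding), None
    except UnicodeDecodeError:
        # Illegally encoded name: hand back the original bytes unchanged.
        return None, raw_path


# The path from the question fails as UTF-8, so the bytes are preserved.
text, blob = decode_or_keep(b"C:\\FXG\x99.nfo", encoding="utf-8")
```

Keeping the bytes rather than decoding with errors='replace' means the original name can still be used to open the file later.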
Upvotes: 1