Amir Rachum

Reputation: 79635

File path encoding in Windows

I'm writing a small application that saves file paths to a database (using Django). I assumed file paths are UTF-8 encoded, but I ran into the following file name: C:\FXG™.nfo, which is apparently not encoded in UTF-8.

When I do filepath.decode('utf-8') I get the following error:

UnicodeDecodeError: 'utf8' codec can't decode byte 0x99 in position 30: invalid start byte

(I trimmed the file name, so the position is wrong here).

How can I find out how file paths are encoded, so that decoding works for every file name?

Upvotes: 0

Views: 2361

Answers (1)

Dima Tisnek

Reputation: 11781

Use sys.getfilesystemencoding().

That should let you decode every path whose name is validly encoded.
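
A minimal sketch of that idea, assuming the path arrives as a byte string (as with a Python 2 str). The helper name to_text is hypothetical, and the cp1252 code page in the usage line is only an assumption about a typical Western European Windows setup; the value that sys.getfilesystemencoding() actually returns depends on the Python version and platform.

```python
import sys


def to_text(raw_path, encoding=None):
    """Decode a byte-string path to text (hypothetical helper).

    Falls back to the interpreter's filesystem encoding when no
    encoding is given. Text input is returned unchanged.
    """
    if encoding is None:
        encoding = sys.getfilesystemencoding()
    if isinstance(raw_path, bytes):
        # Raises UnicodeDecodeError if the name is not valid in `encoding`.
        return raw_path.decode(encoding)
    return raw_path


# Byte 0x99 is the trademark sign in cp1252 (assumed code page here).
print(repr(to_text(b"C:\\FXG\x99.nfo", "cp1252")))
```

Passing the encoding explicitly, as in the last line, is only for illustration; in the application you would normally leave it out and let the filesystem encoding be used.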

However, there can always be illegally encoded files or folders; you have to decide how to deal with those within your application.

Some apps ignore such files; others keep the name as a binary blob.
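
Both strategies can be sketched in one pass, under the assumption that the paths are available as byte strings. The function name split_paths is hypothetical: names that decode cleanly are returned as text, and the rest are kept as raw bytes so the caller can either drop them or store them as blobs.

```python
import sys


def split_paths(raw_paths, encoding=None):
    """Separate decodable paths from undecodable ones (hypothetical helper).

    Returns (decoded, blobs): `decoded` holds text paths, `blobs` holds
    the original byte strings that are not valid in `encoding`.
    """
    if encoding is None:
        encoding = sys.getfilesystemencoding()
    decoded, blobs = [], []
    for raw in raw_paths:
        try:
            decoded.append(raw.decode(encoding))
        except UnicodeDecodeError:
            # Keep the raw bytes; the caller decides whether to skip
            # the file or store the name as a binary blob.
            blobs.append(raw)
    return decoded, blobs


decoded, blobs = split_paths([b"readme.txt", b"FXG\x99.nfo"], "utf-8")
```

An application that ignores bad names would simply discard blobs, while one that must track every file could save them in a binary column next to the text paths.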

Upvotes: 1
