Paul Taylor

Reputation: 13190

Can you have filenames encoded using different encodings on the same filesystem (on Linux)?

On a particular filesystem, is it defined what encoding filenames are created in, or can they be created using any encoding?

That is, on one filesystem, is it legal to have some filenames encoded as UTF-8 and others as UTF-16? I am asking about the filename itself, not the contents of the file.

Upvotes: 2

Views: 626

Answers (1)

Mikel Rychliski

Reputation: 3597

Linux does not interpret filenames or paths as having any particular encoding. Filenames may contain any byte, in any order, except for NUL (0x00) and / (0x2F). It is up to the application to decide the interpretation.
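
As a rough illustration, here is a minimal Python sketch that creates two files in one directory whose names are the same text in two different encodings. The helper name `create_mixed_names` and the sample name are made up for this example, and it assumes a native Linux filesystem that stores names as raw bytes (some filesystems or mount options may reject byte sequences that are not valid UTF-8).

```python
import os
import tempfile


def create_mixed_names(directory: bytes) -> list[bytes]:
    """Create two files whose names are the same text in different encodings.

    Hypothetical helper for illustration; assumes the filesystem accepts
    arbitrary bytes (other than NUL and '/') in filenames.
    """
    names = [
        "café".encode("utf-8"),    # b'caf\xc3\xa9'
        "café".encode("latin-1"),  # b'caf\xe9', which is not valid UTF-8
    ]
    for name in names:
        # A bytes path is handed to the kernel unchanged.
        with open(os.path.join(directory, name), "wb"):
            pass
    # Passing a bytes directory makes listdir return the raw names as bytes.
    return sorted(os.listdir(directory))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for raw in create_mixed_names(os.fsencode(tmp)):
            print(raw)
```

Both names coexist because the kernel only compares bytes; a program listing the directory has to guess or be told which encoding each name uses.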

Because of the prohibition on NUL bytes, UTF-16 cannot be used in practice (its encoded form often contains NUL bytes).
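
A quick way to see the NUL problem, sketched in Python (the function name `utf16_has_nul` and the sample names are just for illustration): any ASCII character becomes a two-byte unit in UTF-16 with one byte equal to zero, so almost any ordinary name is affected.

```python
def utf16_has_nul(name: str) -> bool:
    """Return True if the UTF-16 (little-endian) form of name contains a NUL byte."""
    return b"\x00" in name.encode("utf-16-le")


if __name__ == "__main__":
    # ASCII characters encode as <byte> 0x00 in UTF-16-LE, so this prints True.
    print(utf16_has_nul("report.txt"))
    # The UTF-8 form of the same name has no NUL bytes, so this prints False.
    print(b"\x00" in "report.txt".encode("utf-8"))
    # Some non-ASCII names happen to avoid NUL bytes, hence "often" rather than "always".
    print(utf16_has_nul("日本語"))
```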

The on-disk format for NTFS requires that filenames be stored in UTF-16. For such a filesystem, the iocharset mount option selects the encoding to use: all UTF-16 names from NTFS are converted to that encoding so they are visible through the Linux filesystem API (and vice versa). The UDF, ISO9660, JFS, and FAT filesystems also support storing Unicode code points in a particular encoding, so iocharset is meaningful for them as well.

In practice, UTF-8 is most commonly used.

Upvotes: 5
