Reputation: 13190
On a particular filesystem, is the encoding of filenames defined, or can filenames be created using any encoding?
That is, on one filesystem, is it legal to have some filenames encoded as UTF-8 and others as UTF-16? I am not asking about the contents of the file, just the filename itself.
Upvotes: 2
Views: 626
Reputation: 3597
Linux does not interpret filenames or paths as having any particular encoding. Filenames may contain any byte, in any order, except for NUL (0x00) and / (0x2F). It is up to the application to decide the interpretation.
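To make the "bytes, not characters" point concrete, here is a minimal sketch in Python. The helper is_valid_component is a hypothetical name used only for illustration; it checks just the two forbidden bytes and ignores other limits such as maximum name length and the reserved names "." and "..".

```python
def is_valid_component(name: bytes) -> bool:
    """Return True if `name` is non-empty and has no NUL (0x00) or '/' (0x2F).

    Hypothetical helper for illustration only; it does not check length
    limits or the reserved names "." and "..".
    """
    return len(name) > 0 and b"\x00" not in name and b"/" not in name


utf8_name = "café".encode("utf-8")      # b'caf\xc3\xa9'
latin1_name = "café".encode("latin-1")  # b'caf\xe9'

# Both byte sequences pass the check, and they are two different names.
both_valid = is_valid_component(utf8_name) and is_valid_component(latin1_name)
distinct = utf8_name != latin1_name

# An application that assumes UTF-8 can still carry the Latin-1 bytes
# through unchanged by using the surrogateescape error handler.
shown = latin1_name.decode("utf-8", errors="surrogateescape")
round_trip = shown.encode("utf-8", errors="surrogateescape") == latin1_name
```

So the same visible name can exist twice in one directory under different encodings, and it is the application's job to pick an interpretation when displaying it.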
Because of the prohibition on NUL bytes, UTF-16 cannot be used in practice: its encoded form often contains NUL bytes (every ASCII character, for example, is encoded with a zero byte).
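A quick way to see the problem, sketched in Python with an arbitrary example name:

```python
name = "readme.txt"  # arbitrary example name

utf16 = name.encode("utf-16-le")
utf8 = name.encode("utf-8")

# Each ASCII character becomes two bytes in UTF-16, one of them zero,
# so the encoded name is full of NUL bytes.
utf16_has_nul = b"\x00" in utf16
utf8_has_nul = b"\x00" in utf8

# UTF-16 can also produce the other forbidden byte, 0x2F ('/'):
# U+012F encodes as the bytes 2F 01 in little-endian UTF-16.
ogonek = "\u012f".encode("utf-16-le")
ogonek_has_slash = b"/" in ogonek
```

Here utf16_has_nul is true while utf8_has_nul is false; UTF-8 only emits a zero byte for the code point U+0000 itself.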
The on-disk format for NTFS requires that filenames be stored in UTF-16. In that case the iocharset mount option is used: all UTF-16 names from NTFS are converted to the encoding it names so that they are visible through the Linux filesystem API (and vice versa). The UDF, ISO9660, JFS, and FAT file systems also support storing Unicode code points in a particular encoding, so iocharset is meaningful for them as well.
In practice, UTF-8 is most commonly used.
Upvotes: 5