Unicode vs. UTF-8

Question

I believe Windows currently defaults to UTF-16 for “Unicode”, but that this may not be the case in the future.

For this reason, would it be better to use

[System.Text.Encoding]::UTF8.GetString($someByteArray)

instead of the following?

[System.Text.Encoding]::Unicode.GetString($someByteArray)

bobince · Accepted Answer

“this may not be the case in the future.”

System.Text.Encoding.Unicode isn't a potentially variable encoding; it's just Microsoft's (sadly misleading) name for UTF-16LE.

It isn't going to change. Even if Microsoft moved towards implementing Windows APIs natively in UTF-8 or UTF-32 (something there's no sign of ever happening), System.Text.Encoding.Unicode would have to remain UTF-16LE as that is how it is defined by the .NET specification.
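As a quick check of the point above, here is a minimal sketch that inspects the encoding object from PowerShell. The variable names ($enc, $bytes, $hex) are only illustrative.

```powershell
# Inspect what .NET means by "Unicode".
$enc = [System.Text.Encoding]::Unicode

$enc.WebName    # utf-16
$enc.CodePage   # 1200 (UTF-16, little-endian)

# 'A' is U+0041; in UTF-16LE the low byte comes first.
$bytes = $enc.GetBytes('A')
$hex = ($bytes | ForEach-Object { $_.ToString('X2') }) -join ' '
$hex            # 41 00
```

The big-endian variant is exposed separately as [System.Text.Encoding]::BigEndianUnicode.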

“would it be better to use UTF8 instead of Unicode?”

Use UTF8 if the byte array contains UTF-8-encoded bytes, and use Unicode if they are in UTF-16LE.
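To illustrate why the decoder has to match the bytes, here is a small sketch that encodes one string both ways and then decodes it correctly and incorrectly. The sample text and variable names are made up for the example; the non-ASCII character is built from its code point so the script does not depend on how the file itself is saved.

```powershell
# Sample text "hello" with an e-acute (U+00E9) in second position.
$text = "h$([char]0x00E9)llo"

$utf8Bytes  = [System.Text.Encoding]::UTF8.GetBytes($text)     # 6 bytes
$utf16Bytes = [System.Text.Encoding]::Unicode.GetBytes($text)  # 10 bytes

# Matching decoder: the original string comes back.
$okUtf8  = [System.Text.Encoding]::UTF8.GetString($utf8Bytes)
$okUtf16 = [System.Text.Encoding]::Unicode.GetString($utf16Bytes)

# Mismatched decoder: UTF-8 bytes read as UTF-16LE pair up into
# unrelated code units, so the result is not the original text.
$garbled = [System.Text.Encoding]::Unicode.GetString($utf8Bytes)

$okUtf8  -ceq $text   # True
$okUtf16 -ceq $text   # True
$garbled -ceq $text   # False
```

Neither call raises an error on the mismatch, so the wrong choice tends to show up only as corrupted text later on.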

If you get to choose what encoding is used to store data at rest, UTF-8 is usually the better choice for space efficiency reasons.
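The space difference can be seen by counting bytes for the same string in both encodings. This is only a rough sketch with made-up sample strings; it also shows why the advice says "usually", since text dominated by CJK characters takes 3 bytes per character in UTF-8 but 2 in UTF-16.

```powershell
# ASCII-heavy text: 1 byte per character in UTF-8, 2 in UTF-16.
$ascii = 'Hello, world'   # 12 characters
$asciiUtf8  = [System.Text.Encoding]::UTF8.GetByteCount($ascii)      # 12
$asciiUtf16 = [System.Text.Encoding]::Unicode.GetByteCount($ascii)   # 24

# Two CJK characters (U+65E5, U+672C), built from code points.
$cjk = "$([char]0x65E5)$([char]0x672C)"
$cjkUtf8  = [System.Text.Encoding]::UTF8.GetByteCount($cjk)          # 6
$cjkUtf16 = [System.Text.Encoding]::Unicode.GetByteCount($cjk)       # 4
```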
