Heinrich Ulbricht

Reputation: 10372

Could there be encoding-related problems when storing unicode strings in ini files?

There are already questions about Unicode and ini files, but many of them are rather domain-specific, so I am not sure whether their answers apply to the general case.

Motivation: I want to use ini files for storing simple data like some numbers and some strings. The strings are provided by users (input via GUI). The software could run anywhere in the world, so any language can be used. The files can also be shared between users (written on one system, read on another, and so on).

I figured that unicode in ini files should be no problem when using GetPrivateProfileStringW and WritePrivateProfileStringW (I am targeting systems >= Windows XP).

But then I stumbled upon an answer in this question.

Quote:

The WritePrivateProfileStringW function will write the INI file in legacy system encoding (e.g. Shift-JIS on a Japanese system) because it is a legacy support function. If you want to have a fully Unicode-enabled INI file, you will need to use an external library.

I am unsure now - do I have to worry? Or can I just go ahead and use ini files?

Edit:

It seems the key to avoiding random encodings might be to prepare an empty file containing a BOM and then use this file. Does anyone have experience with this, positive or negative?

Upvotes: 2

Views: 2454

Answers (2)

Heinrich Ulbricht

Reputation: 10372

The answer is: Yes, there can be problems depending on whether the file already exists and (if it exists) how its content is encoded.

An ini file is treated as Unicode if its content is already Unicode. Internally this seems to be determined by the IsTextUnicode function, for which a matching BOM at the start of the file is a strong hint towards Unicode. So calling WritePrivateProfileStringW alone does not guarantee that Unicode is written to the ini file; you have to prepare the file first.
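Preparing the file could look roughly like the sketch below. It assumes the BOM in question is the UTF-16 little-endian one (bytes FF FE), since that is the encoding Windows uses for wide strings. The helper names `prepare_unicode_ini` and `write_value`, the file name `settings.ini` and the section/key names are made up for illustration. The actual write goes through WritePrivateProfileStringW via ctypes and only runs on Windows; everywhere else the sketch just creates the BOM-only file.

```python
import codecs
import os
import sys
import tempfile


def prepare_unicode_ini(path):
    """Create an ini file that contains only a UTF-16 LE BOM.

    Returns True if the file was created, False if it already existed.
    (Hypothetical helper, not part of any Windows or Python API.)
    """
    try:
        # Mode "xb" fails if the file exists, so existing data is never overwritten.
        with open(path, "xb") as f:
            f.write(codecs.BOM_UTF16_LE)  # bytes FF FE
        return True
    except FileExistsError:
        return False


def write_value(path, section, key, value):
    """Write one value through the Win32 profile API (Windows only)."""
    import ctypes

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    fn = kernel32.WritePrivateProfileStringW
    fn.argtypes = [ctypes.c_wchar_p] * 4
    fn.restype = ctypes.c_int  # BOOL
    # Pass a full path; a bare file name is looked up in the Windows directory.
    if not fn(section, key, value, os.path.abspath(path)):
        raise ctypes.WinError(ctypes.get_last_error())


# Example file name and location are assumptions for this sketch.
ini_path = os.path.join(tempfile.mkdtemp(), "settings.ini")
created = prepare_unicode_ini(ini_path)

if sys.platform == "win32":
    write_value(ini_path, "user", "name", "日本語 Привет")

# Whatever was written afterwards, the file should still start with the BOM.
with open(ini_path, "rb") as f:
    head = f.read(2)
print("starts with UTF-16 LE BOM:", head == codecs.BOM_UTF16_LE)
```

Because the helper refuses to touch an existing file, an ini file that was already created in a legacy encoding stays as it is and would have to be converted separately.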

Source: Michael Kaplan's Blog

Upvotes: 1

Frédéric Hamidi

Reputation: 262939

The problem is not really with the use of ini files but with the functions you'll use to read from and write to those files.

As you noticed, WritePrivateProfileStringW() will not by itself write Unicode data to a new file. Instead, it will use whatever multi-byte encoding is standard on the system. That means that ini files created this way on a Japanese system won't be readable on a Russian system, and vice versa.

If the files are not intended to be shared between systems with different encodings, you'll be fine. Otherwise, maybe you shouldn't use ini files but a more Unicode-aware technology such as XML, whose encoding defaults to UTF-8 on all platforms.
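The cross-system problem described above can be reproduced without the profile API at all. The sketch below only simulates it with Python's codecs, assuming cp932 (Shift-JIS) as the Japanese system code page and cp1251 as the Russian one. The function name `roundtrip` is made up for illustration.

```python
def roundtrip(text, write_codepage, read_codepage):
    """Encode text as the writing system would, decode as the reading system would."""
    raw = text.encode(write_codepage)
    # errors="replace" mirrors a reader that shows substitute characters
    # instead of failing on bytes it cannot map.
    return raw.decode(read_codepage, errors="replace")


original = "日本語"

same_system = roundtrip(original, "cp932", "cp932")    # Japanese -> Japanese
other_system = roundtrip(original, "cp932", "cp1251")  # Japanese -> Russian
via_utf16 = roundtrip(original, "utf-16", "utf-16")    # Unicode on both sides

print("same code page preserved text: ", same_system == original)
print("other code page preserved text:", other_system == original)
print("UTF-16 preserved text:         ", via_utf16 == original)
```

The text survives only when writer and reader agree on the encoding, which is why a Unicode encoding fixed in the file itself avoids the dependency on the system code page.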

Upvotes: 1
