Reputation: 11
By default, std::string on my machine holds GBK, and the string literals I write in my program are GBK-encoded. But sometimes I receive data from a server that is encoded in UTF-8, and I want to determine which character set a given string is using. I have looked at how UTF-8 and GBK encoding work, and it seems hard to implement this myself.
Upvotes: 0
Views: 958
Reputation: 598134
To check if a std::string contains UTF-8 content, decode it as UTF-8 and see if it fails.
To check if a std::string contains GBK, decode it as GBK and see if it fails.
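There are plenty of conversion libraries available, such as iconv and ICU, which are preinstalled on many platforms. Or use platform-specific APIs, like MultiByteToWideChar() on Windows (GBK is covered by codepage 936, its superset GB18030 by codepage 54936, and UTF-8 by codepage 65001). With MultiByteToWideChar(), pass the MB_ERR_INVALID_CHARS flag so that invalid input makes the call fail instead of being silently replaced.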
Or just write your own decoder (UTF-8 only takes a few dozen lines of code). You can find details about the bit layouts of UTF-8 and GBK on Wikipedia.
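As a rough illustration of the hand-written approach, here is a minimal sketch. The names is_valid_utf8, looks_like_gbk, guess_encoding and the Encoding enum are made up for this example, not part of any library. The UTF-8 check rejects truncated sequences, overlong forms, surrogates and values above U+10FFFF. The GBK check is only structural (lead byte 0x81-0xFE, trail byte 0x40-0xFE except 0x7F) and does not verify that each pair is actually assigned to a character.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: true if s is entirely well-formed UTF-8.
bool is_valid_utf8(const std::string& s) {
    const unsigned char* p = reinterpret_cast<const unsigned char*>(s.data());
    const std::size_t n = s.size();
    // Smallest code point that legitimately needs a sequence of each length.
    static const unsigned long min_cp[] = {0, 0, 0x80, 0x800, 0x10000};
    std::size_t i = 0;
    while (i < n) {
        const unsigned char c = p[i];
        if (c < 0x80) { ++i; continue; }              // plain ASCII byte
        std::size_t len;
        unsigned long cp;
        if ((c & 0xE0) == 0xC0)      { len = 2; cp = c & 0x1F; }
        else if ((c & 0xF0) == 0xE0) { len = 3; cp = c & 0x0F; }
        else if ((c & 0xF8) == 0xF0) { len = 4; cp = c & 0x07; }
        else return false;                            // stray continuation or invalid lead byte
        if (n - i < len) return false;                // sequence cut off at end of input
        for (std::size_t k = 1; k < len; ++k) {
            if ((p[i + k] & 0xC0) != 0x80) return false;  // not a 10xxxxxx byte
            cp = (cp << 6) | (p[i + k] & 0x3F);
        }
        if (cp < min_cp[len]) return false;           // overlong encoding
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;  // UTF-16 surrogate
        if (cp > 0x10FFFF) return false;              // beyond Unicode range
        i += len;
    }
    return true;
}

// Hypothetical helper: structural GBK check only (byte ranges, not assignments).
bool looks_like_gbk(const std::string& s) {
    const unsigned char* p = reinterpret_cast<const unsigned char*>(s.data());
    const std::size_t n = s.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char c = p[i];
        if (c < 0x80) { ++i; continue; }              // single-byte ASCII
        if (c < 0x81 || c > 0xFE) return false;       // 0x80 and 0xFF are not lead bytes
        if (i + 1 >= n) return false;                 // lead byte without a trail byte
        const unsigned char t = p[i + 1];
        if (t < 0x40 || t > 0xFE || t == 0x7F) return false;
        i += 2;
    }
    return true;
}

enum class Encoding { Ascii, Utf8, Gbk, Unknown };

// Try UTF-8 first: it is the stricter format, so GBK text rarely passes it,
// while UTF-8 text often passes the looser GBK byte-range check.
Encoding guess_encoding(const std::string& s) {
    bool ascii = true;
    for (unsigned char c : s) {
        if (c >= 0x80) { ascii = false; break; }
    }
    if (ascii) return Encoding::Ascii;
    if (is_valid_utf8(s)) return Encoding::Utf8;
    if (looks_like_gbk(s)) return Encoding::Gbk;
    return Encoding::Unknown;
}
```

You would call guess_encoding(data) on the bytes received from the server and convert only when it returns Encoding::Utf8. Keep in mind this is a heuristic: a short GBK string can occasionally happen to be valid UTF-8 too, so if the server can tell you the encoding, prefer that.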
Upvotes: 1