Reputation: 63
What are the options for converting ISO 8859-X to Unicode in C++? By Unicode I mean Unicode code points between 0 and 65,535, since every ISO 8859-X character set maps into that range.
The most obvious option would be to get the mapping tables (e.g. http://ftp.unicode.org/Public/MAPPINGS/ISO8859/8859-7.TXT) and write a parser for them. But I suppose there are libraries for this (I have found none)?
I know there is trivial code for ISO-8859-1 conversion, but let's ignore it since it works for that one encoding only.
Can you share what the options are, and possibly the pros and cons of each?
Personally, I would prefer something lightweight, since I only need the one-way conversion from ISO 8859-X, not full Unicode support.
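For reference, the mapping-table approach mentioned above could look roughly like the sketch below. It assumes the file layout used by the unicode.org tables (a hex byte value, a hex code point, then a `#` comment on each line). The function name `parse_mapping` and the choice of U+FFFD for unmapped bytes are illustrative assumptions, not part of any library.

```cpp
#include <array>
#include <exception>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: read a unicode.org-style mapping table
// ("0xA1<TAB>0x2018<TAB># NAME") into a 256-entry lookup table.
// Bytes with no mapping keep the placeholder U+FFFD (an assumption).
std::array<char16_t, 256> parse_mapping(std::istream& in) {
    std::array<char16_t, 256> table;
    table.fill(0xFFFD);

    std::string line;
    while (std::getline(in, line)) {
        // Drop the trailing comment (or the whole line if it is a comment).
        const std::string::size_type hash = line.find('#');
        if (hash != std::string::npos) line.erase(hash);

        std::istringstream fields(line);
        std::string byte_text, code_text;
        if (!(fields >> byte_text >> code_text)) continue;  // blank or incomplete line

        try {
            // Base 16 accepts the optional "0x" prefix.
            const unsigned long byte = std::stoul(byte_text, nullptr, 16);
            const unsigned long code = std::stoul(code_text, nullptr, 16);
            if (byte < 256 && code <= 0xFFFF) {
                table[byte] = static_cast<char16_t>(code);
            }
        } catch (const std::exception&) {
            // Not a number: skip the malformed line.
        }
    }
    return table;
}
```

Passing a `std::ifstream` opened on a downloaded table such as 8859-7.TXT should give one lookup table per charset; the parsing cost is paid once at startup.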
Upvotes: 0
Views: 677
Reputation: 597710
You can use a dedicated Unicode conversion library like iconv or ICU.
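As a rough sketch of the library route, here is what a conversion through the POSIX `iconv` interface might look like. The wrapper name `iso8859_to_utf16` is made up for illustration, and the encoding names `"ISO-8859-7"` and `"UTF-16LE"` are assumed to be supported by the iconv implementation on your system (some platforms also need `-liconv` at link time).

```cpp
#include <iconv.h>

#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical wrapper around POSIX iconv: converts bytes in the given
// ISO-8859-X charset (e.g. "ISO-8859-7") to UTF-16 code units.
std::u16string iso8859_to_utf16(const std::string& input, const char* charset) {
    // Ask for an explicit byte order so no BOM is emitted.
    iconv_t cd = iconv_open("UTF-16LE", charset);
    if (cd == reinterpret_cast<iconv_t>(-1)) {
        throw std::runtime_error("iconv_open failed");
    }

    // Every ISO-8859-X byte maps to at most one 16-bit code unit.
    std::string raw(input.size() * 2, '\0');
    char* in_ptr = const_cast<char*>(input.data());
    std::size_t in_left = input.size();
    char* out_ptr = &raw[0];
    std::size_t out_left = raw.size();

    const std::size_t rc = iconv(cd, &in_ptr, &in_left, &out_ptr, &out_left);
    iconv_close(cd);
    if (rc == static_cast<std::size_t>(-1)) {
        throw std::runtime_error("iconv conversion failed");
    }
    raw.resize(raw.size() - out_left);

    // Assemble little-endian byte pairs into code units, independent of host byte order.
    std::u16string result;
    for (std::size_t i = 0; i + 1 < raw.size(); i += 2) {
        const unsigned lo = static_cast<unsigned char>(raw[i]);
        const unsigned hi = static_cast<unsigned char>(raw[i + 1]);
        result.push_back(static_cast<char16_t>(lo | (hi << 8)));
    }
    return result;
}
```

ICU offers similar functionality through its own converter API, but it is a much larger dependency.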
However, if all you need is conversion from ISO-8859-X to Unicode, not the other way around, and no other charsets, then you could simply declare a static wchar_t[16][256] array containing the appropriate Unicode codepoints. The ISO-8859 parts are numbered 1 through 16 (part 12 was never published; there are a few more if you count variants), with up to 256 values each. Then you can loop through your input string, using its characters (cast to unsigned char, so that bytes above 0x7F do not become negative indexes) as indexes into the array.
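A minimal sketch of that lookup, shown for a single charset row. The names `Table`, `make_partial_greek_table` and `to_unicode` are illustrative. The table here is only a small excerpt of ISO-8859-7 (a real one would list all 256 entries from the mapping file), and filling the remaining slots with U+FFFD is an assumption.

```cpp
#include <array>
#include <string>

using Table = std::array<char16_t, 256>;  // one row of the [16][256] array

// Illustrative excerpt of ISO-8859-7, NOT the complete table.
Table make_partial_greek_table() {
    Table t;
    t.fill(0xFFFD);  // placeholder for entries omitted from this excerpt
    // 0x00-0x9F match the Unicode code points of the same value in every ISO-8859 part.
    for (unsigned i = 0; i < 0xA0; ++i) t[i] = static_cast<char16_t>(i);
    t[0xA1] = 0x2018;  // LEFT SINGLE QUOTATION MARK
    t[0xC1] = 0x0391;  // GREEK CAPITAL LETTER ALPHA
    t[0xE1] = 0x03B1;  // GREEK SMALL LETTER ALPHA
    return t;
}

// Convert by indexing the table with each input byte.
std::u16string to_unicode(const std::string& input, const Table& table) {
    std::u16string out;
    out.reserve(input.size());
    // Reading each char as unsigned char keeps bytes >= 0x80 from
    // turning into negative indexes where char is signed.
    for (unsigned char byte : input) {
        out.push_back(table[byte]);
    }
    return out;
}
```

Using `char16_t` instead of `wchar_t` is a design choice made for this sketch: it is 16 bits wide on every platform, which is enough because all ISO-8859-X characters map below 65,536.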
Upvotes: 1