Chris

Reputation: 63

Ways to convert ISO 8859-X to UNICODE

What are the options for converting ISO 8859-X to Unicode in C++? By Unicode I mean code points between 0 and 65,535, since every character in the ISO 8859-X character sets maps into that range.

The most obvious option would be to get the mapping tables (for example http://ftp.unicode.org/Public/MAPPINGS/ISO8859/8859-7.TXT) and write a parser for them. But I suppose there are some libraries for this (I have found none)?

I know there is trivial code for ISO-8859-1 conversion, but let's ignore it, since it works for that particular encoding only.

Can you share what the options are, and possibly the pros and cons of each?

Personally, I would prefer something lightweight, since I need only one-way conversion from ISO 8859-X, not full Unicode support.

Upvotes: 0

Views: 677

Answers (1)

Remy Lebeau

Reputation: 597710

You can use a dedicated Unicode conversion library like iconv or ICU.
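As a rough illustration of the library route, here is a minimal sketch using the POSIX iconv API. The function name `iso8859_to_utf16` is hypothetical, and the sketch assumes a little-endian host, a POSIX-style `iconv` signature (`char**` input pointer), and an iconv build that recognizes names such as "ISO-8859-7" and "UTF-16LE".

```cpp
#include <iconv.h>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: convert bytes in the given ISO-8859 charset to UTF-16.
std::u16string iso8859_to_utf16(const std::string& input, const char* charset) {
    // "UTF-16LE" suppresses the byte-order mark; assumes a little-endian host.
    iconv_t cd = iconv_open("UTF-16LE", charset);
    if (cd == (iconv_t)-1) throw std::runtime_error("iconv_open failed");

    // Every ISO-8859 byte maps to at most one BMP code unit,
    // so input.size() code units is always enough room.
    std::u16string out(input.size(), u'\0');
    char* in_ptr = const_cast<char*>(input.data());
    std::size_t in_left = input.size();
    char* out_ptr = reinterpret_cast<char*>(&out[0]);
    std::size_t out_left = out.size() * sizeof(char16_t);

    std::size_t rc = iconv(cd, &in_ptr, &in_left, &out_ptr, &out_left);
    iconv_close(cd);
    if (rc == (std::size_t)-1) throw std::runtime_error("iconv failed");

    // Trim the part of the buffer that was not written.
    out.resize(out.size() - out_left / sizeof(char16_t));
    return out;
}
```

Calling `iso8859_to_utf16(bytes, "ISO-8859-7")` would then convert Greek text. On platforms where iconv is a separate library, linking with `-liconv` may be needed.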

However, if all you need is conversion from ISO-8859-X to Unicode, not the other way around, and no other charsets, then you could simply declare a static wchar_t[16][256] array containing the appropriate Unicode code points. The ISO-8859 parts are numbered 1 through 16 (part 12 was never published, and there are a few more charsets if you count variants), with up to 256 values each. Then you can loop through your input string, using each character, cast to unsigned char, as an index into the array row for the chosen charset.
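The table approach could look roughly like the sketch below for a single charset. Rather than hard-coding 256 values, it fills the table from text in the format of the unicode.org mapping files mentioned in the question (two hex columns, `#` starting a comment). The names `Table`, `load_table`, and `convert` are hypothetical, and `char16_t` is used in place of `wchar_t` only because the width of `wchar_t` differs between platforms.

```cpp
#include <array>
#include <istream>
#include <sstream>
#include <string>

// One lookup table per charset: index = input byte, value = Unicode code point.
using Table = std::array<char16_t, 256>;

// Fill a table from lines such as "0xC1<TAB>0x0391<TAB># GREEK CAPITAL LETTER ALPHA".
Table load_table(std::istream& in) {
    Table table;
    table.fill(0xFFFD);  // U+FFFD REPLACEMENT CHARACTER marks unmapped bytes
    std::string line;
    while (std::getline(in, line)) {
        line = line.substr(0, line.find('#'));  // drop the comment part, if any
        std::istringstream fields(line);
        unsigned long byte = 0, code_point = 0;
        // std::hex extraction accepts the "0x" prefix used in the mapping files.
        if (fields >> std::hex >> byte >> code_point &&
            byte < 256 && code_point <= 0xFFFF) {
            table[byte] = static_cast<char16_t>(code_point);
        }
    }
    return table;
}

// Map each input byte through the table.
std::u16string convert(const std::string& input, const Table& table) {
    std::u16string out;
    out.reserve(input.size());
    for (unsigned char byte : input)  // unsigned so bytes >= 0x80 index correctly
        out.push_back(table[byte]);
    return out;
}
```

In practice one would open the downloaded mapping file with `std::ifstream` and pass it to `load_table` once at startup, or generate a static array from it at build time to stay closer to the static table described above.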

Upvotes: 1
