Abdelwahed

Reputation: 1712

How to convert from UTF-8 to Latin/Arabic and vice versa?

Is there a cross-platform way to convert from UTF-8 to Latin/Arabic and from Latin/Arabic to UTF-8 in C++?

Upvotes: 4

Views: 1574

Answers (2)

Jan Hudec

Reputation: 76276

There is not, but there is a cross-platform way to convert between Unicode held in wchar_t (which is 16-bit on Windows and 32-bit on most other platforms) and whatever the system locale's character encoding is: the wcstombs/mbstowcs routines from the standard C library, or the codecvt facet of a locale in the standard C++ library. Note that this only yields Latin/Arabic if the active locale actually uses that encoding. The conversion between UTF-8 and wchar_t is then quite simple wherever each element holds one code point. With a 16-bit wchar_t, code points above U+FFFF need a surrogate pair, but none of the characters that Latin/Arabic can represent are in that range. So you can write, or copy from somewhere, a routine that converts between UTF-8 and Unicode in wchar_t and combine it with wcstombs/mbstowcs.
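For the "routine to convert between UTF-8 and Unicode" part, something along these lines should do. It is only a minimal sketch: the names utf8_to_utf32 and utf32_to_utf8 are made up for this post, and it uses char32_t (C++11) rather than wchar_t so that one element is always one code point, Windows included. It rejects truncated, overlong and surrogate sequences by throwing std::invalid_argument, which is my choice and not something the standard library prescribes.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper: decode a UTF-8 byte string into code points.
std::u32string utf8_to_utf32(const std::string& in) {
    // Smallest code point that legitimately needs 0, 1, 2 or 3 continuation bytes.
    static const char32_t min_cp[] = {0x0, 0x80, 0x800, 0x10000};
    std::u32string out;
    std::size_t i = 0;
    while (i < in.size()) {
        unsigned char lead = static_cast<unsigned char>(in[i]);
        char32_t cp;
        int extra;  // number of continuation bytes that must follow
        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");

        if (in.size() - i - 1 < static_cast<std::size_t>(extra))
            throw std::invalid_argument("truncated UTF-8 sequence");

        for (int k = 1; k <= extra; ++k) {
            unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);
        }
        // Reject overlong forms, surrogates and values beyond U+10FFFF.
        if (cp < min_cp[extra] || (cp >= 0xD800 && cp <= 0xDFFF) || cp > 0x10FFFF)
            throw std::invalid_argument("invalid code point in UTF-8 input");

        out.push_back(cp);
        i += static_cast<std::size_t>(extra) + 1;
    }
    return out;
}

// Hypothetical helper: encode code points as UTF-8.
std::string utf32_to_utf8(const std::u32string& in) {
    std::string out;
    for (char32_t cp : in) {
        if ((cp >= 0xD800 && cp <= 0xDFFF) || cp > 0x10FFFF)
            throw std::invalid_argument("code point cannot be encoded as UTF-8");
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

For example, the two bytes D8 A8 decode to the single code point U+0628 (ARABIC LETTER BEH), and encoding that code point gives the same two bytes back. To feed the result to wcstombs on a platform with 16-bit wchar_t you would still have to split code points above U+FFFF into surrogate pairs, which this sketch does not do.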

Upvotes: 0

Christopher Creutzig

Reputation: 8774

There are libraries like ICU available. But Erik is, of course, right: the round trip from Unicode through ISO 8859-6 will be lossy. (Yes, UTF-8 is “Unicode.” UTF-16 is “Unicode,” too; it just uses different bit patterns for the same code numbers. See Joel Spolsky's text if you didn't know that. Or if you haven't read it yet, it's good material.)
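To make the lossiness concrete, here is a rough sketch of the ISO 8859-6 mapping done by hand, without ICU. The function names are made up for this post. It assumes the usual ISO-8859-6 charset layout: bytes 0x00 to 0xA0, 0xA4 and 0xAD map to the same Unicode values, the assigned Arabic positions sit at a fixed offset of 0x560 below their Unicode code points, and every other byte is unassigned. Both functions return false when there is no mapping, and that is exactly where information gets lost.

```cpp
// True if byte b is one of the assigned Arabic positions in ISO 8859-6.
static bool is_arabic_byte(unsigned char b) {
    return b == 0xAC || b == 0xBB || b == 0xBF ||   // Arabic comma, semicolon, question mark
           (b >= 0xC1 && b <= 0xDA) ||              // U+0621 .. U+063A
           (b >= 0xE0 && b <= 0xF2);                // U+0640 .. U+0652
}

// Hypothetical helper: ISO 8859-6 byte -> Unicode code point.
// Returns false for byte values the charset leaves unassigned.
bool iso8859_6_to_unicode(unsigned char b, char32_t& cp) {
    if (b <= 0xA0 || b == 0xA4 || b == 0xAD) { cp = b; return true; }
    if (is_arabic_byte(b)) { cp = static_cast<char32_t>(b) + 0x560; return true; }
    return false;
}

// Hypothetical helper: Unicode code point -> ISO 8859-6 byte.
// Returns false when the character does not exist in ISO 8859-6.
bool unicode_to_iso8859_6(char32_t cp, unsigned char& b) {
    if (cp <= 0xA0 || cp == 0xA4 || cp == 0xAD) {
        b = static_cast<unsigned char>(cp);
        return true;
    }
    if (cp >= 0x60C && cp <= 0x652) {
        unsigned char candidate = static_cast<unsigned char>(cp - 0x560);
        if (is_arabic_byte(candidate)) { b = candidate; return true; }
    }
    return false;
}
```

U+0628 (ARABIC LETTER BEH) comes out as byte 0xC8 and maps back to U+0628, so that direction round-trips. U+067E (ARABIC LETTER PEH, used in Persian) has no byte at all, so a caller has to substitute something or fail, and the original text cannot be recovered afterwards.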

Upvotes: 3
