Reputation: 35
I am using the iconv function with the TRANSLIT option.
When converting from UTF-8 to CP1251, are there cases where transliteration replaces one character with several characters? Where can I look up that information?
Upvotes: 1
Views: 867
Reputation: 120239
The most obvious one is
$ echo 'ß' | iconv -f UTF-8 -t CP1251//TRANSLIT
ss
In addition, if your locale is German, umlauts are transliterated according to German rules (yes, transliteration is locale-dependent).
$ export LC_ALL=de_DE.UTF-8
$ echo 'Füße' | iconv -f utf-8 -t CP1251//TRANSLIT
Fuesse
(Some versions will print F"usse instead.)
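To see the locale dependence side by side, here is a minimal sketch. The helper name show_translit is made up for illustration, and it assumes the de_DE.UTF-8 locale is installed on your system; if it is not, the second call will most likely behave like the first. What gets printed depends on your iconv implementation.

```shell
# show_translit LOCALE TEXT (hypothetical helper, not part of iconv):
# transliterate TEXT from UTF-8 to CP1251 under the given locale.
show_translit() {
    printf '%s\n' "$2" | LC_ALL="$1" iconv -f UTF-8 -t CP1251//TRANSLIT
}

# Same input, two locales; compare the results by eye.
show_translit C 'Füße'
show_translit de_DE.UTF-8 'Füße'
```

Setting LC_ALL only for the single iconv call avoids changing the locale of the rest of your shell session.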
Upvotes: 0
Reputation: 157504
There are some, depending on the implementation and locale:
$ echo '℀⇒½' | iconv -f UTF8 -t CP1251//TRANSLIT
a/c=> 1/2
These are, respectively, U+2100 ACCOUNT OF transliterated as a/c, U+21D2 RIGHTWARDS DOUBLE ARROW transliterated as =>, and U+00BD VULGAR FRACTION ONE HALF transliterated as 1/2 (including spaces).
I found these in the GNU libc source code, https://github.com/lattera/glibc/blob/master/locale/C-translit.h.in; different implementations may not transliterate these characters the same way if at all.
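If you would rather probe your own iconv than read the table, a rough sketch like the following may help. The helper name translit_len is made up for illustration. It relies on CP1251 being a single-byte encoding, so any result above 1 means the character was expanded into several characters. Results will vary by implementation and locale, and iconv may report an error for characters it cannot convert at all.

```shell
# translit_len CHAR (hypothetical helper, not part of iconv):
# print how many bytes iconv emits for CHAR when converting to CP1251.
# CP1251 uses one byte per character, so a count above 1 indicates
# a one-to-many transliteration.
translit_len() {
    printf '%s' "$1" | iconv -f UTF-8 -t CP1251//TRANSLIT | wc -c | tr -d ' '
}

# Probe a few characters; 'я' exists in CP1251, so it should stay one byte.
for ch in 'ß' '½' '⇒' 'я'; do
    printf '%s -> %s byte(s)\n' "$ch" "$(translit_len "$ch")"
done
```

The tr call only strips the padding that some wc implementations add around the count.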
Upvotes: 3