Basawaraj

Reputation: 41

How to convert Unicode characters to lower case

I have a problem converting Unicode characters to lower case in VC++ MFC. I have Unicode characters in a CString variable. With English text, MakeLower() works fine and I get lower case, but it cannot convert the Unicode characters to lower case. I did try the STL algorithm transform:

std::string data = "ИИИЛЛЛЛ"; // Bulgarian chars

std::transform(data.begin(), data.end(), data.begin(), ::tolower);

but it fails to load the Unicode chars; I get "????" symbols in place of the Unicode chars.

Can you please let me know if there is a solution for Unicode chars? I don't want to use the Boost libraries. Thanks in advance!
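For reference, here is a minimal sketch of why the byte-wise call above cannot lower-case such text. It assumes the narrow string holds UTF-8 bytes, which is an assumption: in a VC++ build the literal may instead be stored in the ANSI code page, where unrepresentable letters already turn into "?". The names `lower_bytes` and `upper_il` are made up for illustration.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper mirroring the std::transform call above:
// applies std::tolower to every single byte of a narrow string.
std::string lower_bytes(std::string text)
{
    std::transform(text.begin(), text.end(), text.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return text;
}

// The two letters U+0418 and U+041B spelled out as UTF-8 bytes
// (assumed encoding): each letter occupies two bytes.
const std::string upper_il = "\xD0\x98\xD0\x9B";
```

In the default "C" locale, `lower_bytes("ABC")` gives `"abc"`, while `lower_bytes(upper_il)` returns the bytes unchanged, because no single byte of a multi-byte letter is an upper-case character on its own.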

Upvotes: 4

Views: 3739

Answers (3)

Ale

Reputation: 997

I couldn't find the word "lower" in the IDN2 documentation, but noticed that domain names are converted to lowercase. Consider this C snippet:

#include <stdio.h>
#include <idn2.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
    for (int i = 1; i < argc; ++i)
    {
        char *out = NULL, *out2 = NULL;
        int rtc = idn2_to_ascii_8z(argv[i], &out, 0);
        int rtc2 = rtc == 0? idn2_to_unicode_8z8z(out, &out2, 0): -1;
        printf("%2d/%2d  %s -> %s -> %s\n", rtc, rtc2,
            argv[i], out? out: "NULL", out2? out2: "-");
        free(out);
        free(out2);
    }

    return 0;
}

It accepts some fancy characters, like hwair, but not a simple °:

ale@alenovo:~/tmp$ gcc -W -Wall -g -O0 lower.c -lidn2
ale@alenovo:~/tmp$ ./a.out ASCII àÃĈOÖÖ°o àÃĈOÖÖo ИИИЛЛЛЛ 𐍈ǶǶǶǶƕƕƕ
 0/ 0  ASCII -> ascii -> ascii
-304/-1  àÃĈOÖÖ°o -> NULL -> -
 0/ 0  àÃĈOÖÖo -> xn--oo-iiam0ha4k -> àãĉoööo
 0/ 0  ИИИЛЛЛЛ -> xn--h1aaamaaa -> ииилллл
 0/ 0  𐍈ǶǶǶǶƕƕƕ -> xn--6haaaaaaa57883c -> 𐍈ƕƕƕƕƕƕƕ

Upvotes: 0

MSalters

Reputation: 179799

Try

std::wstring data = L"ИИИЛЛЛЛ"; // Wide chars

// towlower comes from <cwctype>; its result depends on the current C locale
std::transform(data.begin(), data.end(), data.begin(), ::towlower);
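
A slightly fuller sketch of the same idea, in case it helps. The `std::tolower` template in `<locale>` takes the locale as a second argument, so it has to be wrapped before it can be passed to `std::transform`. The helper name `to_lower_wide` and the default-constructed locale are assumptions made for illustration, not part of the answer above.

```cpp
#include <algorithm>
#include <locale>
#include <string>

// Hypothetical helper: returns a lower-cased copy of a wide string.
// The two-argument std::tolower from <locale> uses the ctype<wchar_t>
// facet of the given locale; the default argument is a copy of the
// current global locale.
std::wstring to_lower_wide(std::wstring text, const std::locale& loc = std::locale())
{
    std::transform(text.begin(), text.end(), text.begin(),
                   [&loc](wchar_t c) { return std::tolower(c, loc); });
    return text;
}
```

With the default argument and an unchanged global locale, `to_lower_wide(L"ABC")` gives `L"abc"`. Whether Cyrillic letters are converted depends on the locale, so it may be necessary to pass something like `std::locale("")` instead; locale names are platform-specific, and the constructor throws `std::runtime_error` for a name it does not know.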

Upvotes: 1

Edward Clements

Reputation: 5132

If your project uses the Unicode Character Set (project properties), CString::MakeLower() should work -- note that this will not convert the contents of the string, it returns a new string, see this MSDN article:

CString s1(_T("ABC")), s2;
s2 = s1.MakeLower();
ASSERT(s2 == _T("abc"));   

EDIT: CString::MakeLower() does change the contents of the string; it also returns a reference to the converted string.

Upvotes: 3
