Reputation: 57
I'm working with UIAutomation and I'm struggling with localized BSTRs. I'm in Germany, so the strings contain special characters (umlauts) that come out garbled. I'm logging this information and need it in UTF-8 so I can process it later.
I have already tried every variant of the answers I could find involving WideCharToMultiByte, but that just turns the garbled character into an even more garbled one. I'd really appreciate it if anyone could tell me what I'm doing wrong; it's really bugging me.
I tried both of the following versions and got the same result both times (the upper line is the converted string, the lower one the original):
The first word should be "Schaltfläche" and the second "Fünf".
The code I tried:
BSTR* origin;
_bstr_t originWrapper(*origin);
char* originChar = originWrapper;
size_t len = strlen(originChar) + 1;
int room = MultiByteToWideChar(CP_ACP, 0, originChar, -1, NULL, 0);
wchar_t* unicodeString = (wchar_t*)malloc((sizeof(wchar_t))*room);
MultiByteToWideChar(CP_ACP, 0, originChar, -1, unicodeString, room);
int size_needed = WideCharToMultiByte(CP_UTF8, 0, unicodeString, -1, NULL, 0, NULL, NULL);
char* utf8Char = (char*) malloc(size_needed);
WideCharToMultiByte(CP_UTF8, 0, unicodeString, -1, utf8Char, size_needed, NULL, NULL);
and
BSTR* origin;
_bstr_t originWrapper(*origin);
int size_needed = WideCharToMultiByte(CP_UTF8, 0, originWrapper, SysStringByteLen(*origin), NULL, 0, NULL, NULL);
std::string resultingString(size_needed, 0);
WideCharToMultiByte(CP_UTF8, 0, *origin, SysStringByteLen(*origin), &resultingString[0], size_needed, NULL, NULL);
Upvotes: 1
Views: 1775
Reputation: 126877
BSTR is a pointer to UTF-16 (WCHAR) character data, preceded by the string length. So your round trip through narrow strings is misguided; you should use WideCharToMultiByte directly:
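#include <windows.h>
#include <oleauto.h>
#include <string>

std::string BSTRtoUTF8(BSTR bstr) {
    int len = SysStringLen(bstr);
    // special case because a NULL BSTR is a valid zero-length BSTR,
    // but regular string functions would balk on it
    if (len == 0) return "";
    // first call: ask how many UTF-8 bytes the conversion needs
    int size_needed = WideCharToMultiByte(CP_UTF8, 0, bstr, len, NULL, 0, NULL, NULL);
    if (size_needed <= 0) return ""; // conversion failed
    std::string ret(size_needed, '\0');
    // second call: convert into the buffer (&ret[0] is writable before C++17 too)
    WideCharToMultiByte(CP_UTF8, 0, bstr, len, &ret[0], size_needed, NULL, NULL);
    return ret;
}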
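If it helps to know which bytes a correct conversion should produce, here is a small platform-independent sketch of the UTF-16 to UTF-8 mapping. It is only an illustration, not a replacement for WideCharToMultiByte: the name utf16ToUtf8 is made up for this example, and replacing unpaired surrogates with U+FFFD is an assumption of the sketch.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of any Windows API): encodes UTF-16 code
// units as UTF-8 bytes, to show what a correct conversion looks like.
std::string utf16ToUtf8(const std::u16string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < in.size()
            && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
            // Valid surrogate pair: combine into one code point above U+FFFF.
            cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
            ++i;
        } else if (cp >= 0xD800 && cp <= 0xDFFF) {
            cp = 0xFFFD; // Assumption of this sketch: replace unpaired surrogates.
        }
        if (cp < 0x80) {
            // ASCII: one byte, unchanged.
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            // Two bytes: this is the range that covers the German umlauts.
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            // Three bytes: rest of the Basic Multilingual Plane.
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            // Four bytes: code points that needed a surrogate pair in UTF-16.
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

With this mapping the "ü" (U+00FC) in "Fünf" becomes the two bytes C3 BC, and the "ä" (U+00E4) in "Schaltfläche" becomes C3 A4, so these are the bytes to look for in the converted output.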
To check the validity of the conversion, don't print the result to the console, as it doesn't support UTF-8 output by default (it interprets narrow strings not even as CP_ACP, but as CP_OEMCP, go figure). Instead, write the output to a file and check it with a reliable editor that supports UTF-8.
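A minimal sketch of that check, using only the standard library: dump the converted bytes to a file in binary mode, or render them as hex so the console code page cannot get in the way. Both helper names (writeUtf8File, hexDump) are made up for this example.

```cpp
#include <cstdio>
#include <fstream>
#include <string>

// Hypothetical helper: writes the bytes to a file exactly as they are.
// Binary mode keeps the runtime from translating line endings.
bool writeUtf8File(const std::string& path, const std::string& utf8) {
    std::ofstream out(path, std::ios::binary);
    out.write(utf8.data(), static_cast<std::streamsize>(utf8.size()));
    out.flush(); // surface write errors before reporting success
    return static_cast<bool>(out);
}

// Hypothetical helper: renders each byte as two uppercase hex digits,
// separated by spaces, so the bytes can be inspected in any code page.
std::string hexDump(const std::string& bytes) {
    std::string out;
    char buf[4];
    for (unsigned char b : bytes) {
        std::snprintf(buf, sizeof buf, "%02X", b);
        if (!out.empty()) out += ' ';
        out += buf;
    }
    return out;
}
```

For example, you could call writeUtf8File("log.txt", BSTRtoUTF8(bstr)) and open log.txt in an editor set to UTF-8 (the file name is just a placeholder). If the conversion worked, hexDump on the result for "Fünf" should show C3 BC where the "ü" is.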
Upvotes: 6