user1540336

Reputation: 65

Convert ASCII string to Unicode? Windows, pure C

I've found answers to this question for many programming languages, except for C, using the Windows API. No C++ answers please. Consider the following:

#include <windows.h>
#include <string.h>   /* strlen */
char *string = "The quick brown fox jumps over the lazy dog";
WCHAR unistring[strlen(string)+1];

What function can I use to fill unistring with the characters from string?

Upvotes: 5

Views: 21219

Answers (6)

Mark Ransom

Reputation: 308138

If you KNOW that the input is pure ASCII and there are no extended character sets involved, there's no need to call any fancy conversion function. All the character codes in ASCII are the same in Unicode, so all you need to do is copy from one array to the other.

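#include <windows.h>
#include <string.h>   /* strlen */
char *string = "The quick brown fox jumps over the lazy dog";
size_t len = strlen(string);
WCHAR unistring[len+1];   /* C99 variable-length array; must live inside a function */
size_t i;
/* i <= len so that the terminating '\0' is copied as well */
for (i = 0; i <= len; ++i)
    unistring[i] = string[i];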
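Since the copy loop is only safe when the input really is pure ASCII, it may be worth checking that while copying. The following is a minimal sketch, not part of the original answer: `ascii_to_wide` is a hypothetical helper name, and it uses plain `wchar_t` (which `WCHAR` is a typedef for on Windows) so it compiles without `windows.h`.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper: widens src into dst only if every byte is 7-bit ASCII.
   dst_len is the capacity of dst in wide characters, terminator included.
   Returns the number of characters copied (terminator excluded),
   or -1 if a non-ASCII byte is found or the buffer is too small. */
static long ascii_to_wide(const char *src, wchar_t *dst, size_t dst_len)
{
    size_t i;

    assert(src != NULL && dst != NULL);

    for (i = 0; src[i] != '\0'; ++i) {
        unsigned char c = (unsigned char)src[i];
        if (c > 0x7F || i + 1 >= dst_len)
            return -1;      /* non-ASCII byte, or no room left for the terminator */
        dst[i] = (wchar_t)c;
    }
    if (dst_len == 0)
        return -1;          /* not even room for the terminator */
    dst[i] = L'\0';
    return (long)i;
}
```

A caller would pass the destination array and its element count, for example `ascii_to_wide(string, unistring, len + 1)`, and fall back to a real conversion function if the result is -1.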

Upvotes: 1

Dmytro

Reputation: 5213

This is another way to do it. It's not as direct, but it does the job when you don't feel like typing six arguments in a very specific order and remembering the code page numbers/macros for MultiByteToWideChar. It takes 16 microseconds on this laptop, most of it (9 microseconds) spent in AddAtomW.

For reference, MultiByteToWideChar takes between 0 and 1 microseconds.

#include <Windows.h>

const wchar_t msg[] = L"We did it!";

int main(int argc, char **argv)
{
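    char result[(sizeof(msg) / sizeof(msg[0])) + 1];
    ATOM tmp;

    /* Store the wide string as an atom, then read it back as ANSI.
       Note that this example converts wide -> narrow. */
    tmp = AddAtomW(msg);
    GetAtomNameA(tmp, result, sizeof(result));
    MessageBoxA(NULL, result, "it says", MB_OK | MB_ICONINFORMATION);
    DeleteAtom(tmp);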

    return 0;
}

Upvotes: 0

pive_

Reputation: 51

You should look into the MultiByteToWideChar function.

Upvotes: 2

DevSolar

Reputation: 70263

If you are really serious about Unicode, you should refer to International Components for Unicode, which is a cross-platform solution for handling Unicode conversions and storage in either C or C++.

Your WCHAR, for example, is not Unicode to begin with, because Microsoft somewhat prematurely defined wchar_t to be 16-bit (UCS-2) and got stuck in backward-compatibility hell when Unicode grew beyond 16 bits. UCS-2 is almost, but not quite, identical to UTF-16, the latter being in fact a multibyte encoding just like UTF-8. "Wide" format in Unicode means 32-bit (UTF-32), and even then you don't have a 1:1 relationship between code points (i.e. 32-bit values) and abstract characters (i.e. printable glyphs).

Gratuitous, loosely related list of links:

Upvotes: 3

Some programmer dude

Reputation: 409166

You can use mbstowcs to convert from "multibyte" to wide character strings.
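A minimal sketch of how that could look, assuming the input is encoded in the current C locale (the default "C" locale is enough for plain ASCII; call setlocale first for anything else). The helper name `widen` is hypothetical. The buffer is sized from strlen because a multibyte string never produces more wide characters than it has bytes.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: converts a multibyte string to a newly allocated
   wide string using the current C locale. The caller frees the result.
   Returns NULL on an invalid multibyte sequence or allocation failure. */
static wchar_t *widen(const char *src)
{
    size_t cap;
    wchar_t *dst;

    assert(src != NULL);

    cap = strlen(src) + 1;              /* upper bound, terminator included */
    dst = malloc(cap * sizeof *dst);
    if (dst == NULL)
        return NULL;
    if (mbstowcs(dst, src, cap) == (size_t)-1) {
        free(dst);                      /* invalid sequence for this locale */
        return NULL;
    }
    return dst;
}
```

Since wchar_t and WCHAR are the same type on Windows, the result can be passed straight to the wide-character API functions.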

Upvotes: 0

Rup

Reputation: 34408

MultiByteToWideChar:

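#include <windows.h>
#include <string.h>   /* strlen */
char *string = "The quick brown fox jumps over the lazy dog";
size_t len = strlen(string);
WCHAR unistring[len + 1];
/* -1: the input is null-terminated; the last argument is the output capacity in WCHARs.
   For pure ASCII any code page gives the same result; CP_ACP is the usual choice for ANSI text. */
int result = MultiByteToWideChar(CP_OEMCP, 0, string, -1, unistring, (int)(len + 1));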

Upvotes: 12
