daven11

Reputation: 3025

When a unicode key is entered is it UTF-8/UTF-16 or what?

Using Windows (XP, for argument's sake): when someone typing in a foreign language, e.g. Arabic, presses a key and the editor stores that character in a string, is it encoded as UTF-8, UTF-16, or something else?

The reason I'm asking is that I'm looking at how to get Unicode strings into a Lua script. Lua can store UTF-8 in a string. So where is the encoding performed: in the keyboard or driver before it reaches the IDE, or in the IDE itself?

Please forgive the vagueness of the question. Once I have a Unicode string everything is clear; what I'm not sure of is how the encoding gets in, particularly with non-US-English keyboards, since I only have a US-English keyboard.

tia

Upvotes: 0

Views: 1259

Answers (2)

Hans Passant

Reputation: 941465

Windows sends the WM_CHAR message to tell you that a typing key was pressed. The MSDN Library article about it is crystal clear:

The WM_CHAR message uses Unicode Transformation Format (UTF)-16.

If you need it encoded in UTF-8 then you'll need to translate it. Use WideCharToMultiByte() with the CodePage argument set to CP_UTF8.
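To make the translation step concrete, here is a small sketch in Python of what that conversion does to the data. It is not the Win32 call itself: the helper name `utf16_units_to_utf8` is made up for illustration, and it assumes the input is a list of 16-bit code units such as a Unicode window would receive one at a time via WM_CHAR.

```python
import struct

def utf16_units_to_utf8(units):
    """Convert a sequence of UTF-16 code units (ints) to UTF-8 bytes.

    Hypothetical helper for illustration only; on Windows the same job
    is done by WideCharToMultiByte() with CP_UTF8.
    """
    # Pack each code unit as a little-endian 16-bit value.
    raw = struct.pack("<%dH" % len(units), *units)
    # Decode the UTF-16 bytes to text, then re-encode the text as UTF-8.
    return raw.decode("utf-16-le").encode("utf-8")

# Arabic letter alef (U+0627) fits in a single UTF-16 code unit.
alef = utf16_units_to_utf8([0x0627])

# A character outside the Basic Multilingual Plane (U+1F600) is a
# surrogate pair in UTF-16, i.e. two code units.
emoji = utf16_units_to_utf8([0xD83D, 0xDE00])

print(alef)   # b'\xd8\xa7'
print(emoji)  # b'\xf0\x9f\x98\x80'
```

The surrogate-pair case is why the code units should be collected before converting: the two halves of the pair only form a valid character together.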

Upvotes: 3

deceze

Reputation: 522081

The keyboard has nothing to do with this. You can type Japanese with a US keyboard, for example. The keyboard just sends key codes to the OS. The OS interprets these key codes depending on which keyboard layout is selected. It may simply turn these codes into characters on the screen (which character depends on the keyboard layout you chose), or it may invoke an IME for inputting complex languages, which then in turn produces some characters on screen. These characters are so far most likely handled in UTF-16 behind the scenes, but that doesn't need to concern you at all. If you're typing into a text editor, you can then finally specify which encoding you want to save the file in. This will then be the final encoding for source code files.
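As a rough illustration of that last point, the sketch below (Python, with a made-up Arabic sample string and a hypothetical Lua variable name `greeting`) shows that the same typed text becomes different bytes depending on the encoding chosen at save time. Since Lua strings are plain byte sequences, the assumption here is that the script would be saved as UTF-8.

```python
# The text as the editor holds it in memory: five Arabic letters.
text = "\u0645\u0631\u062d\u0628\u0627"

# The same characters serialized two different ways at save time.
as_utf8 = text.encode("utf-8")
as_utf16 = text.encode("utf-16-le")

# Same characters, different bytes on disk.
print(as_utf8.hex())
print(as_utf16.hex())

# A line of Lua source containing the text, encoded as UTF-8.
# Lua would keep the bytes between the quotes exactly as stored.
lua_line = 'local greeting = "' + text + '"\n'
lua_bytes = lua_line.encode("utf-8")
```

Reading `lua_bytes` back with any other encoding would not reproduce the original characters, which is why the encoding picked in the editor's save dialog is the one that matters.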

Upvotes: 5
