ting12
ting12

Reputation: 101

MySql Getting from utf8 code point to utf16 code point and back?

Example based on the 好 Chinese character (utf8:E5A5BD, utf16:597D), MySQL 5.5.35 UTF-8 Unicode

I can get UTF-8 code point from character:

SELECT HEX('好');

=> E5A5BD

I can get UTF-16 encoded character from UTF-16 code point:

SELECT CHAR(0x597D USING utf16);

=> 好

But then how to get to the related UTF-8 code point?

And I can't figure out how to get from the UTF-8 code point back to anywhere, neither to the character nor to the UTF-16 code point.

Any suggestion?

Upvotes: 2

Views: 1784

Answers (2)

masakielastic
masakielastic

Reputation: 4640

If you want to check emoji (U+10000 and later);

// initialize character set utf8mb4
SET NAMES 'utf8mb4';

// codepoint: U+1F42C (DOLPHIN)
// UTF-32: 0x0001F42C
// UTF-16: 0xD83D 0xDC2C
// UTF-8 : 0xF0 0x9F 0x90 0xAC

// UTF-32 -> UTF-16
// result: D83DDC2C
SELECT HEX(CONVERT(CHAR(0x1F42C using utf32) using utf16));

// UTF-16 -> UTF-8
// result: F09F90AC
SELECT HEX(CONVERT(CHAR(0xD83DDC2C USING utf16) USING utf8mb4));

// UTF-8 -> UTF-32
// result: 0001F42C
SELECT HEX(CONVERT(CHAR(0xF09F90AC USING utf8mb4) USING utf32));

Upvotes: 0

Joni
Joni

Reputation: 111289

You can use the CONVERT function to encode the string in UTF-8, and then the HEX function to get the hexadecimal representation.

SELECT hex(convert(CHAR(0x597D using utf16) using utf8));

=> E5A5BD

Upvotes: 3

Related Questions