Reputation: 101
Example based on the 好 Chinese character (utf8:E5A5BD, utf16:597D), MySQL 5.5.35 UTF-8 Unicode
I can get UTF-8 code point from character:
SELECT HEX('好');
=> E5A5BD
I can get UTF-16 encoded character from UTF-16 code point:
SELECT CHAR(0x597D USING utf16);
=> 好
But then how to get to the related UTF-8 code point?
And I can't figure out how to get from the UTF-8 code point back to anywhere, neither to the character nor to the UTF-16 code point.
Any suggestion?
Upvotes: 2
Views: 1784
Reputation: 4640
If you want to check emoji (U+10000 and later);
// initialize character set utf8mb4
SET NAMES 'utf8mb4';
// codepoint: U+1F42C (DOLPHIN)
// UTF-32: 0x0001F42C
// UTF-16: 0xD83D 0xDC2C
// UTF-8 : 0xF0 0x9F 0x90 0xAC
// UTF-32 -> UTF-16
// result: D83DDC2C
SELECT HEX(CONVERT(CHAR(0x1F42C using utf32) using utf16));
// UTF-16 -> UTF-8
// result: F09F90AC
SELECT HEX(CONVERT(CHAR(0xD83DDC2C USING utf16) USING utf8mb4));
// UTF-8 -> UTF-32
// result: 0001F42C
SELECT HEX(CONVERT(CHAR(0xF09F90AC USING utf8mb4) USING utf32));
Upvotes: 0