Reputation: 7038
For the inverted question mark ¿ I receive two bytes, [-62][-65]. How would I get a readable UTF-8 or ASCII character from them?
Upvotes: 0
Views: 1594
Reputation: 881383
That is the UTF8 code for that character. The inverted question mark is Unicode code point 191 which, in UTF8, is 0xc2:0xbf.
You're seeing them as signed bytes. For example, -62 signed is 256-62, or 194 unsigned; that's hex 0xc2.
Similarly, -65 signed is 256-65, or 191 unsigned; that's hex 0xbf.
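As a rough sketch of that arithmetic, here is the same conversion in Python. The question doesn't say which language produced the signed values, so the list below is simply typed in by hand, and the variable names are only illustrative.

```python
# Signed byte values as reported in the question (assumed to come from a
# language whose byte type is signed).
signed = [-62, -65]

# Masking with 0xFF maps each value into 0..255, which is the same as
# adding 256 to the negative ones.
unsigned = [b & 0xFF for b in signed]
print([hex(b) for b in unsigned])  # ['0xc2', '0xbf']

# Decoding the two unsigned bytes as UTF-8 gives the character back.
text = bytes(unsigned).decode("utf-8")
print(ord(text))  # 191, i.e. U+00BF
```

After this runs, `text` holds the single character ¿.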
If you want to convert your UTF8 sequence into a code point, you can use the table below.
Range              Encoding  Binary value
-----------------  --------  --------------------------
U+000000-U+00007f  0xxxxxxx  0xxxxxxx

U+000080-U+0007ff  110yyyxx  00000yyy xxxxxxxx
                   10xxxxxx

U+000800-U+00ffff  1110yyyy  yyyyyyyy xxxxxxxx
                   10yyyyxx
                   10xxxxxx

U+010000-U+10ffff  11110zzz  000zzzzz yyyyyyyy xxxxxxxx
                   10zzyyyy
                   10yyyyxx
                   10xxxxxx
For example, your 0xc2:0xbf is binary 11000010 10111111, which matches the second case:
11000010 10111111
   |||||   ||||||
     |||\\ //////
     ||| ||||||||
00000000 10111111 -> 0x00bf -> 191
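If you'd rather do that bit shuffling in code, something along these lines should work. It's a minimal sketch, not a full validator: the helper name `decode_utf8_code_point` is made up for illustration, and it doesn't reject overlong encodings or surrogate code points.

```python
def decode_utf8_code_point(data):
    """Decode the first UTF-8 sequence in `data` into a code point.

    Illustrative helper: checks lead and continuation bit patterns only.
    """
    first = data[0]
    if first < 0x80:               # 0xxxxxxx: single byte
        length, value = 1, first
    elif first >> 5 == 0b110:      # 110yyyxx: two bytes
        length, value = 2, first & 0x1F
    elif first >> 4 == 0b1110:     # 1110yyyy: three bytes
        length, value = 3, first & 0x0F
    elif first >> 3 == 0b11110:    # 11110zzz: four bytes
        length, value = 4, first & 0x07
    else:
        raise ValueError("invalid UTF-8 lead byte")

    if len(data) < length:
        raise ValueError("truncated UTF-8 sequence")

    # Each continuation byte (10xxxxxx) contributes its low six bits.
    for byte in data[1:length]:
        if byte >> 6 != 0b10:
            raise ValueError("invalid continuation byte")
        value = (value << 6) | (byte & 0x3F)
    return value


print(decode_utf8_code_point(bytes([0xC2, 0xBF])))  # 191
```

In practice you'd normally let the language's built-in decoder do this; the manual version is only useful for seeing how the table maps onto shifts and masks.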
Upvotes: 4
Reputation: 399803
Look at the byte values in hexadecimal: -62 is 0xc2 and -65 is 0xbf.
If you look up the Unicode information for the glyph in question, you can see that these are, indeed, the two bytes that make up the UTF-8 encoding of the inverted question mark glyph.
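One way to do that lookup without leaving the terminal is Python's standard `unicodedata` module. This is just a quick check, and the variable names are illustrative.

```python
import unicodedata

ch = "\u00bf"  # the inverted question mark, U+00BF

# Official Unicode character name for the code point.
name = unicodedata.name(ch)
print(name)  # INVERTED QUESTION MARK

# Hex dump of the character's UTF-8 encoding.
encoded_hex = ch.encode("utf-8").hex()
print(encoded_hex)  # c2bf
```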
Upvotes: 1
Reputation: 273229
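Those 2 bytes are probably UTF-8.
The character is not in 7-bit ASCII, so for a single-byte representation you would need a specific code page (for example ISO-8859-1, where it is the single byte 0xBF).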
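To illustrate the difference, here is a small sketch that encodes the same character three ways. The choice of ISO-8859-1 ("latin-1") is only an example of a single-byte code page, not something the question specifies.

```python
ch = "\u00bf"  # inverted question mark

latin1 = ch.encode("latin-1")  # one byte in ISO-8859-1
utf8 = ch.encode("utf-8")      # two bytes in UTF-8

# Plain ASCII only covers code points 0..127, so this encode fails.
try:
    ch.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

print(latin1.hex(), utf8.hex(), ascii_ok)  # bf c2bf False
```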
And what exactly is a 'readable' char encoding?
Upvotes: 1