Reputation: 68907
I have this string:
std::string str = "presents";
And when I iterate over the characters, they come in this order:
spresent
So, the last char comes first.
This is the code:
uint16_t c;
printf("%s: ", str.c_str());
for (unsigned int i = 0; i < str.size(); i += extractUTF8_Char(str, i, &c)) {
    printf("%c", c);
}
printf("\n");
And this is the extract method:
uint8_t extractUTF8_Char(string line, int offset, uint16_t *target) {
    uint8_t ch = uint8_t(line.at(offset));
    if ((ch & 0xC0) == 0xC0) {
        if (!target) {
            return 2;
        }
        uint8_t ch2 = uint8_t(line.at(offset + 1));
        uint16_t fullCh = (uint16_t(((ch & 0x1F) >> 2)) << 8) | ((ch & 0x3) << 0x6) | (ch2 & 0x3F);
        *target = fullCh;
        return 2;
    }
    if (target) {
        *target = ch;
    }
    return 1;
}
This method returns the length of the character in bytes: 1 or 2. If the length is 2 bytes, it also extracts the Unicode code point from the UTF-8 string.
Upvotes: 1
Views: 1528
Reputation: 4871
Your first printf is printing nonsense (the initial value of c). The last c obtained is never printed. This is because the call to extractUTF8_Char occurs in the last clause of the for statement, which runs only after each pass through the loop body, so every printf sees the value from the previous iteration. You might want to change it to
for (unsigned int i = 0; i < str.size();) {
    i += extractUTF8_Char(str, i, &c);
    printf("%c", c);
}
instead.
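To see the difference side by side, here is a minimal sketch that collects the values each loop shape would print. It is not the asker's exact code: iterateOriginal and iterateFixed are hypothetical helper names, the decoder is a simplified stand-in that only handles 1-byte and 2-byte sequences (it tests for the 110xxxxx lead byte with mask 0xE0 and combines the bits in one shift), and c is given an explicit initial value because reading it uninitialized, as in the question, is undefined behavior.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Simplified stand-in for the question's extractUTF8_Char (assumption: only
// 1-byte and 2-byte UTF-8 sequences matter). Returns the bytes consumed.
std::uint8_t extractUTF8_Char(const std::string& line, std::size_t offset,
                              std::uint16_t* target) {
    const std::uint8_t ch = static_cast<std::uint8_t>(line.at(offset));
    if ((ch & 0xE0) == 0xC0) {  // 110xxxxx: lead byte of a 2-byte sequence
        if (target) {
            // at() throws std::out_of_range if the sequence is truncated.
            const std::uint8_t ch2 = static_cast<std::uint8_t>(line.at(offset + 1));
            // 5 payload bits from the lead byte, 6 from the continuation byte.
            *target = static_cast<std::uint16_t>(((ch & 0x1F) << 6) | (ch2 & 0x3F));
        }
        return 2;
    }
    if (target) {
        *target = ch;
    }
    return 1;
}

// Loop shape from the question: extraction sits in the increment clause,
// so the body always sees the value of c from the previous iteration.
// 'initial' stands in for whatever c happened to hold before the loop.
std::vector<std::uint16_t> iterateOriginal(const std::string& str,
                                           std::uint16_t initial) {
    std::vector<std::uint16_t> seen;
    std::uint16_t c = initial;
    for (std::size_t i = 0; i < str.size(); i += extractUTF8_Char(str, i, &c)) {
        seen.push_back(c);
    }
    return seen;
}

// Loop shape from the answer: extract first, then use c.
std::vector<std::uint16_t> iterateFixed(const std::string& str) {
    std::vector<std::uint16_t> seen;
    std::uint16_t c = 0;
    for (std::size_t i = 0; i < str.size();) {
        i += extractUTF8_Char(str, i, &c);
        seen.push_back(c);
    }
    return seen;
}
```

With "presents" as input, iterateOriginal yields the initial value followed by the first seven characters, so if c happened to start out holding 's' the result matches the "spresent" from the question. iterateFixed yields all eight characters in order.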
Upvotes: 17