user1804599

How to decode UTF-8 without having illegal input replaced by a replacement character?

icu::UnicodeString::fromUTF8 replaces illegal input with U+FFFD. Is there a way to detect whether it has done this, so that I can throw an exception?

Upvotes: 0

Views: 565

Answers (1)

Roddy

Reputation: 68074

Use u_strFromUTF8 instead:

    UChar* u_strFromUTF8(UChar*       dest,
                         int32_t      destCapacity,
                         int32_t*     pDestLength,
                         const char*  src,
                         int32_t      srcLength,
                         UErrorCode*  pErrorCode)

Convert a UTF-8 string to UTF-16.

If the input string is not well-formed, then the U_INVALID_CHAR_FOUND error code is set.

http://icu-project.org/apiref/icu4c/ustring_8h.html#a5f9ff224b11166a106d1b3ac26454cd4

Upvotes: 2
