nick

Reputation: 57

NSString UTF8String mangling unicode characters

When I call [NSString UTF8String] on certain Unicode characters, the resulting const char* representation is mangled, both in NSLog and on the device/simulator display. The NSString itself displays fine, but I need to convert it to a C string to use it in CGContextShowTextAtPoint.

It's very easy to reproduce (see code below) but I've searched for similar questions without any luck. Must be something basic I'm missing.

const char *cStr = [@"章" UTF8String];
NSLog(@"%s", cStr); 

Thanks!

Upvotes: 2

Views: 1364

Answers (3)

inspector-g

Reputation: 4176

I've never noticed this issue before, but some quick experimentation shows that using printf instead of NSLog will cause the correct Unicode character to show up.

Try:

printf("%s", cStr);

This gives me the desired output ("章") both in the Xcode console and in Terminal. As nob1984 stated in his answer, the interpretation of the character data is up to the callee.

Upvotes: 1

Thant Thet

Reputation: 196

CGContextShowTextAtPoint is only for ASCII chars.

Check this SO question for answers.

Upvotes: 2

NSProgrammer

Reputation: 2396

With the string format specifier (%s), there is no guarantee that the characters of a C string will print correctly if they are not ASCII. A character like the one in your example is encoded in UTF-8 as a multi-byte sequence, but %s interprets the bytes of the string you pass to the formatter (in this case, NSLog) in the system encoding. See Apple's documentation:

https://developer.apple.com/library/mac/documentation/cocoa/Conceptual/Strings/Articles/formatSpecifiers.html

%s Null-terminated array of 8-bit unsigned characters. %s interprets its input in the system encoding rather than, for example, UTF-8.
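To make the point about encodings concrete, here is a minimal sketch in plain C (which also compiles as Objective-C) that dumps the raw bytes behind a C string. The helper name utf8_hex_dump is hypothetical, not part of any Apple API, and the string literal is written with hex escapes on the assumption that it holds the UTF-8 encoding of 章 (U+7AE0), which is what UTF8String should return.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not an Apple API): writes the bytes of s into out
   as space-separated uppercase hex. Returns the number of bytes dumped,
   or 0 if out is too small to hold the result. */
size_t utf8_hex_dump(const char *s, char *out, size_t out_size)
{
    size_t n = 0;    /* bytes consumed from s */
    size_t used = 0; /* characters written to out */

    if (out_size == 0) {
        return 0;
    }
    out[0] = '\0';

    for (; s[n] != '\0'; n++) {
        const char *sep = (n == 0) ? "" : " ";

        /* Worst case per byte: separator + two hex digits + terminating NUL. */
        if (used + 4 > out_size) {
            out[0] = '\0';
            return 0;
        }
        used += (size_t)snprintf(out + used, out_size - used, "%s%02X",
                                 sep, (unsigned int)(unsigned char)s[n]);
    }
    return n;
}
```

Calling utf8_hex_dump("\xE7\xAB\xA0", buf, sizeof buf) with a 16-byte buf returns 3 and leaves "E7 AB A0" in buf. The bytes are valid UTF-8, so the data is intact; what differs between NSLog and printf is how those three bytes get interpreted on output.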

As for CGContextShowTextAtPoint not working: that API supports only the MacRoman character set, which does not cover the entire Unicode character set.
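A rough way to see why a byte-oriented API cannot draw this string: a single-byte encoding treats every byte as its own character, while UTF-8 spreads one character over several bytes. The sketch below is plain C and assumes well-formed UTF-8 input; the helper name utf8_code_point_count is hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper: counts Unicode code points in a well-formed UTF-8
   string by skipping continuation bytes, which have the form 10xxxxxx. */
size_t utf8_code_point_count(const char *s)
{
    size_t count = 0;

    for (; *s != '\0'; s++) {
        if (((unsigned char)*s & 0xC0) != 0x80) {
            count++; /* lead byte or ASCII byte starts a new code point */
        }
    }
    return count;
}
```

For the bytes "\xE7\xAB\xA0" (the UTF-8 encoding of 章), strlen reports 3 while utf8_code_point_count reports 1. An API that maps each byte to one glyph would therefore try to draw three characters instead of one.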

You'll need to look into another API for showing Unicode characters. Core Text is probably where you'll want to start.

Upvotes: 1
