Reputation: 226
I have a problem with umlauts when converting an NSString to const char *.
This method parses a text file of words line by line and stores the words as strings in NSArray *results. Each word is then converted to a const char * (tmpConstChars). After the conversion, an 'ä' shows up as '√§', for example. How do I convert an NSString to a const char * correctly? I thought this was the right way.
- (void)inputWordsByFile:(NSString *)path
{
    // Start with nil; the error is only filled in when reading fails.
    NSError *error = nil;
    NSString *content = [NSString stringWithContentsOfFile:path encoding:NSUTF8StringEncoding error:&error];
    if (content == nil) {
        NSLog(@"Could not read %@: %@", path, error);
        return;
    }
    NSArray *results = [content componentsSeparatedByString:@"\n"];
    NSMutableArray *words = [[NSMutableArray alloc] initWithArray:results];
    // Drops the last entry (empty if the file ends with a newline).
    [words removeLastObject];
    for (NSUInteger i = 0; i < [words count]; i++) {
        const char *tmpConstChars = [[words objectAtIndex:i] UTF8String];
        [self addWordToTree:tmpConstChars];
    }
}
Upvotes: 1
Views: 2908
Reputation: 104065
Unless I am mistaken, the UTF8String method returns the UTF-8 encoded bytes of the string. For zählen, these are:
$ perl -MEncode -Mutf8 -E 'say join ", ", map ord, split //, encode("utf8", "zählen")'
122, 195, 164, 104, 108, 101, 110
…where <195, 164> is the UTF-8 byte sequence for ä. Thus, when you poke into tmpConstChars+2, you get the byte with value 164 back, which is only the second half of ä and not an ASCII character at all. That is probably not what you want. Aren’t you rather after unichars? There’s a characterAtIndex: method that returns those, albeit one at a time:
NSString *test = @"zählen";
unichar c = [test characterAtIndex:1];
NSLog(@"---> %C", c); // ---> ä
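If the tree really has to work on const char *, a rough sketch of decoding the UTF-8 bytes by hand might look like the following. It is plain C, which also compiles as Objective-C. The names utf8_decode and utf8_length are made up for illustration and are not Foundation functions, and the decoder is simplified: it does not reject overlong encodings or surrogate code points.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decodes the UTF-8 sequence starting at s and stores
   the code point in *out. Returns the number of bytes consumed, or 0 if the
   lead byte or one of the continuation bytes is invalid. */
size_t utf8_decode(const char *s, uint32_t *out)
{
    assert(s != NULL && out != NULL);
    const unsigned char *p = (const unsigned char *)s;
    size_t len;
    uint32_t cp;

    if (p[0] < 0x80)                { len = 1; cp = p[0]; }
    else if ((p[0] & 0xE0) == 0xC0) { len = 2; cp = p[0] & 0x1F; }
    else if ((p[0] & 0xF0) == 0xE0) { len = 3; cp = p[0] & 0x0F; }
    else if ((p[0] & 0xF8) == 0xF0) { len = 4; cp = p[0] & 0x07; }
    else return 0;  /* stray continuation byte or invalid lead byte */

    for (size_t i = 1; i < len; i++) {
        /* Every following byte must look like 10xxxxxx; a NUL terminator
           fails this check, so we never read past the end of the string. */
        if ((p[i] & 0xC0) != 0x80) return 0;
        cp = (cp << 6) | (p[i] & 0x3F);
    }
    *out = cp;
    return len;
}

/* Hypothetical helper: counts code points in a NUL-terminated UTF-8 string.
   Returns (size_t)-1 if the input is malformed. */
size_t utf8_length(const char *s)
{
    size_t count = 0;
    uint32_t cp;
    while (*s) {
        size_t n = utf8_decode(s, &cp);
        if (n == 0) return (size_t)-1;
        s += n;
        count++;
    }
    return count;
}
```

With the bytes of zählen, calling utf8_decode at offset 1 consumes two bytes and yields 0xE4 (ä), whereas calling it at offset 2 returns 0, because 164 is a continuation byte and cannot start a character.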
Upvotes: 2