malteriechmann

Reputation: 226

How to convert from NSString to const char* with umlauts in Objective-C?

I have a problem with umlauts when converting an NSString to const char *.

This method parses a text file of words line by line and stores the words as strings in NSArray *results. Each word is then converted to a const char * (tmpConstChars). The resulting C string holds, for example, an 'ä' as '√§'. How do I convert from NSString to const char * correctly? I thought this was the right way.

- (void)inputWordsByFile:(NSString *)path
{
    // Start with nil; the error object is only filled in if reading fails.
    NSError *error = nil;
    NSString *content = [NSString stringWithContentsOfFile:path encoding:NSUTF8StringEncoding error:&error];
    if (content == nil) {
        NSLog(@"Could not read %@: %@", path, error);
        return;
    }
    NSArray *results = [content componentsSeparatedByString:@"\n"];

    NSMutableArray *words = [[NSMutableArray alloc] initWithArray:results];
    [words removeLastObject]; // presumably the empty entry after the final newline
    for (NSUInteger i = 0; i < [words count]; i++) {
        const char *tmpConstChars = [[words objectAtIndex:i] UTF8String];
        [self addWordToTree:tmpConstChars];
    }
}

Upvotes: 1

Views: 2908

Answers (1)

zoul

Reputation: 104065

Unless I am mistaken, the UTF8String method returns the UTF-8 encoding bytes for the string. For zählen, these are:

$ perl -MEncode -Mutf8 -E 'say join ", ", map ord, split //, encode("utf8", "zählen")'
122, 195, 164, 104, 108, 101, 110

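…where <195, 164> is the UTF-8 byte sequence for ä. So when you look at tmpConstChars + 2, you get the single byte 164 (0xA4) back, which is only the second half of ä and not a character in its own right (ASCII stops at 127). The '√§' you see is most likely those two bytes displayed in a one-byte encoding such as Mac OS Roman. That is probably not what you want. Aren’t you after unichars instead? There’s a characterAtIndex: method that returns those (UTF-16 code units), albeit one at a time: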

NSString *test = @"zählen";
unichar c = [test characterAtIndex:1];
NSLog(@"---> %C", c); // ---> ä
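
To see the byte layout from the C side, here is a minimal sketch in plain C (which also compiles as Objective-C). The helper names utf8_code_points and utf8_byte_list are made up for this illustration and are not part of Foundation; the input is assumed to be valid, NUL-terminated UTF-8 such as what UTF8String returns.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: count code points in a UTF-8 string by skipping
 * continuation bytes, which always have the bit pattern 10xxxxxx. */
size_t utf8_code_points(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; s++) {
        if (((unsigned char)*s & 0xC0) != 0x80) {
            n++;
        }
    }
    return n;
}

/* Hypothetical helper: write the decimal value of every byte into out,
 * separated by ", ". Output is cut short if the buffer is too small. */
void utf8_byte_list(const char *s, char *out, size_t out_size)
{
    size_t used = 0;
    if (out_size == 0) {
        return;
    }
    out[0] = '\0';
    for (size_t i = 0; s[i] != '\0'; i++) {
        /* Cast through unsigned char so bytes >= 128 are not printed as negatives. */
        int w = snprintf(out + used, out_size - used, "%s%u",
                         i ? ", " : "", (unsigned)(unsigned char)s[i]);
        if (w < 0 || (size_t)w >= out_size - used) {
            out[used] = '\0'; /* drop the partially written entry */
            break;
        }
        used += (size_t)w;
    }
}
```

Called with "z\xC3\xA4hlen" (zählen with the ä written as explicit bytes, so the source file's encoding does not matter), utf8_byte_list yields the same seven values as the Perl one-liner, strlen reports 7, and utf8_code_points reports 6. The const char * itself is therefore fine; whatever consumes it, addWordToTree: in the question, has to treat it as UTF-8 rather than as one character per byte.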

Upvotes: 2
