joverboard

Reputation: 335

Scandinavian characters æ, ø, å escaped incorrectly

My program interfaces with servers in other countries and regularly needs to handle URLs containing foreign characters. This works fine until we consider Scandinavian characters such as æ, ø, and å. When I receive a URL, I decode it as follows:

-(NSString*)urlDECODE:(NSString*)string
{
    NSString*   s = [string stringByReplacingPercentEscapesUsingEncoding:NSUTF8StringEncoding];

    return (s)?s:string;
}

However, this fails to decode these characters properly:

filename: æøåa.rtf
input: %C3%83%C2%A6%C3%83%C2%B8a%C3%8C%C2%8Aa.rtf
output: Ã¦Ã¸aÌa.rtf

EDIT: This is the encoding function:

NSString * URLEncode(NSString * url)
{
    NSString* out = nil;
    @try
    {
        NSLog(@"BEFORE=%@",url);
        out = [url stringByAddingPercentEscapesUsingEncoding:NSUTF8StringEncoding];
        NSLog(@"AFTER=%@",out);
    }
    @catch (NSException * e)
    {
        NSLog(@"Encoding error: %@", e);
    }

    return out;
}

Upvotes: 0

Views: 1261

Answers (1)

kennytm

Reputation: 523274

It seems your original string was already (mistakenly) encoded in UTF-8 before it was percent-escaped, so it ended up encoded twice.

"Ã¦Ã¸aÌa.rtf" == "\xc3\xa6\xc3\xb8a\xcc\x8aa.rtf"           // each character read as one Latin-1 byte
              == "æ"      "ø"    "a\u030a" "a.rtf"  // in UTF-8
              == "æøåa.rtf"
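The chain above can be checked mechanically. What follows is only a minimal sketch, assuming the stray characters really are UTF-8 bytes that were read as ISO Latin-1, and assuming ARC is enabled. `RepairDoubleEncoded` is a hypothetical helper name, not part of Foundation.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: undo one layer of "UTF-8 bytes read as Latin-1".
// Returns nil if the string does not fit that pattern.
static NSString *RepairDoubleEncoded(NSString *garbled)
{
    // Map each character back to the single byte it came from.
    NSData *bytes = [garbled dataUsingEncoding:NSISOLatin1StringEncoding];
    if (bytes == nil) {
        return nil; // the string contains characters outside Latin-1
    }
    // Interpret those bytes as UTF-8; yields nil if they are not valid UTF-8.
    return [[NSString alloc] initWithData:bytes encoding:NSUTF8StringEncoding];
}

int main(void)
{
    @autoreleasepool {
        // The percent-escaped input from the question.
        NSString *input = @"%C3%83%C2%A6%C3%83%C2%B8a%C3%8C%C2%8Aa.rtf";

        // Same decoding call as in the question's urlDECODE: method.
        NSString *once = [input stringByReplacingPercentEscapesUsingEncoding:NSUTF8StringEncoding];
        NSString *repaired = RepairDoubleEncoded(once);

        NSLog(@"decoded once: %@", once);
        NSLog(@"repaired:     %@", repaired);
    }
    return 0;
}
```

If the assumption holds, the second log line should show the intended file name. This only confirms the diagnosis; the proper fix is on the encoding side, not a repair step after decoding.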

Please check how the NSString passed to URLEncode() is constructed. The rest of the code you've shown is correct (although it's rare to handle exceptions in Objective-C).
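As an illustration of what to look for, here is one way such a string could arise. This is an assumption, since the code that builds the string isn't shown: raw UTF-8 bytes are turned into an NSString using a single-byte encoding. The variable names are made up for the example, and ARC is assumed.

```objc
#import <Foundation/Foundation.h>

int main(void)
{
    @autoreleasepool {
        // UTF-8 bytes of the file name, with å stored as "a" + U+030A.
        // The literal is split so that 'a' is not parsed as part of a \x escape.
        const char *raw = "\xc3\xa6\xc3\xb8" "a\xcc\x8a" "a.rtf";

        // Suspected mistake: every byte becomes one Latin-1 character.
        NSString *misread = [NSString stringWithCString:raw
                                               encoding:NSISOLatin1StringEncoding];

        // Intended: the bytes are decoded as UTF-8.
        NSString *correct = [NSString stringWithUTF8String:raw];

        // Same escaping call as in the question's URLEncode() function.
        NSLog(@"misread: %@",
              [misread stringByAddingPercentEscapesUsingEncoding:NSUTF8StringEncoding]);
        NSLog(@"correct: %@",
              [correct stringByAddingPercentEscapesUsingEncoding:NSUTF8StringEncoding]);
    }
    return 0;
}
```

If this is what happens in your program, the "misread" line should match the input shown in the question, while the "correct" line is what the receiving side can decode in a single pass.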

Upvotes: 1
