Anthony C

Reputation: 1990

How do I properly encode Unicode characters in my NSString?

Problem Statement

I create a number of strings, concatenate them together into CSV format, and then email the string as an attachment.

When these strings contain only ASCII characters, the CSV file is built and emailed properly. When I include non-ASCII characters, the result string becomes malformed and the CSV file is not created properly. (The email view shows an attachment, but it is not sent.)

For instance, this works:

uncle bill's house of pancakes

But this doesn't (note the curly apostrophe):

uncle bill’s house of pancakes

Question

How do I create and encode the final string so that every valid Unicode character is preserved and the resulting string is well formed?

Notes

String Processing Code

I concatenate my strings together like this:

[reportString appendFormat:@"%@,", category];
[reportString appendFormat:@"%@,", client];
[reportString appendFormat:@"%@\n", detail];
etc.

Replacing curly quotes with boring quotes makes it work, but I don't want to do it this way:

- (NSMutableString *)cleanString:(NSString *)activity {
    NSString *temp1 = [activity stringByReplacingOccurrencesOfString:@"’" withString:@"'"];
    NSString *temp2 = [temp1 stringByReplacingOccurrencesOfString:@"‘" withString:@"'"];
    NSString *temp3 = [temp2 stringByReplacingOccurrencesOfString:@"”" withString:@"\""];
    NSString *temp4 = [temp3 stringByReplacingOccurrencesOfString:@"“" withString:@"\""];
    return [NSMutableString stringWithString:temp4];
}

Edit: This is how the attachment is added to the email:

    NSString *attachment = [self formatReportCSV];
    [picker addAttachmentData:[attachment dataUsingEncoding:NSStringEncodingConversionAllowLossy] mimeType:nil fileName:@"MyCSVFile.csv"];

where formatReportCSV is the method that concatenates and returns the CSV string.

Upvotes: 2

Views: 4993

Answers (1)

David Doyle

Reputation: 1716

You seem to be running into a string encoding issue. Without seeing what your Core Data model looks like, I'd assume it boils down to the problem reproduced by the code below.

NSString *string1 = @"Uncle bill’s house of pancakes.";
NSString *string2 = @" Appended with some garbage's stuff.";
NSMutableString *mutableString = [NSMutableString stringWithString: string1];
[mutableString appendString: string2];
NSLog(@"We got: %@", mutableString);
// We got: Uncle bill’s house of pancakes. Appended with some garbage's stuff.

NSData *storedVersion = [mutableString dataUsingEncoding: NSStringEncodingConversionAllowLossy];
NSString *restoredString = [[NSString alloc] initWithData: storedVersion encoding: NSStringEncodingConversionAllowLossy];
NSLog(@"Restored string with NSStringEncodingConversionAllowLossy: %@", restoredString);
// Restored string with NSStringEncodingConversionAllowLossy: 

storedVersion = [mutableString dataUsingEncoding: NSUTF8StringEncoding];
restoredString = [[NSString alloc] initWithData: storedVersion encoding: NSUTF8StringEncoding];
NSLog(@"Restored string with UTF8: %@", restoredString);
// Restored string with UTF8: Uncle bill’s house of pancakes. Appended with some garbage's stuff.

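Note how the first conversion fails. NSStringEncodingConversionAllowLossy is a conversion option, not a string encoding; its raw value (1) happens to equal NSASCIIStringEncoding, so passing it as the encoding asks for ASCII. ASCII cannot represent the curly apostrophe, so dataUsingEncoding: returns nil. (ASCII can cope if you call dataUsingEncoding:allowLossyConversion: with YES as the second parameter, but the original character is then lost.)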
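The overlap between the option flag and the ASCII encoding constant can be checked directly. The following is a minimal sketch, not part of the original answer; the helper names (LossyFlagAliasesASCII, StrictASCIIData, LossyASCIIData) are hypothetical and exist only for illustration.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: YES when the conversion-option flag and the ASCII
// encoding constant share the same raw value.
static BOOL LossyFlagAliasesASCII(void) {
    return (NSUInteger)NSStringEncodingConversionAllowLossy ==
           (NSUInteger)NSASCIIStringEncoding;
}

// Hypothetical helper: strict ASCII conversion. Returns nil when the string
// contains a character that ASCII cannot represent.
static NSData *StrictASCIIData(NSString *string) {
    return [string dataUsingEncoding:NSASCIIStringEncoding];
}

// Hypothetical helper: lossy ASCII conversion. Returns data even when some
// characters have to be substituted or dropped.
static NSData *LossyASCIIData(NSString *string) {
    return [string dataUsingEncoding:NSASCIIStringEncoding
                allowLossyConversion:YES];
}
```

With these helpers, StrictASCIIData(@"uncle bill’s house of pancakes") should come back nil, while the lossy variant returns data in which the curly apostrophe is no longer intact. Neither is what you want for the CSV; UTF-8 keeps the character as it is.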

This code should fix the issue:

NSString *attachment = [self formatReportCSV];
[picker addAttachmentData:[attachment dataUsingEncoding: NSUTF8StringEncoding] mimeType:@"text/csv" fileName:@"MyCSVFile.csv"];

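Note: UTF-8 can represent every Unicode character, including Japanese text, so it is sufficient for correctness. You may still want one of the UTF16 string encodings if the program that opens the CSV file does not detect UTF-8 on its own.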
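If the program that opens the CSV does not recognize the file as UTF-8, one common workaround is to prepend a UTF-8 byte order mark to the data instead of switching encodings. This is a hedged sketch under that assumption, not something the original answer proposes; UTF8DataWithBOM is a hypothetical helper name, and whether a given spreadsheet application honors the mark is something to verify yourself.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: encodes the string as UTF-8 and prefixes the result
// with the three-byte UTF-8 byte order mark (EF BB BF).
static NSData *UTF8DataWithBOM(NSString *csv) {
    NSData *body = [csv dataUsingEncoding:NSUTF8StringEncoding];
    if (body == nil) {
        return nil; // conversion failed; let the caller decide what to do
    }
    const uint8_t bom[] = {0xEF, 0xBB, 0xBF};
    NSMutableData *data = [NSMutableData dataWithBytes:bom length:sizeof(bom)];
    [data appendData:body];
    return data;
}
```

The returned data could be passed to addAttachmentData:mimeType:fileName: in place of the plain UTF-8 data.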

Upvotes: 4
