Reputation: 28248
I have an app that syncs data from a remote DB that users populate. People seem to copy and paste text from many different OSes and programs, which can cause hidden non-ASCII characters to be imported into the system.
For example I end up with this:
Artist:â â Ioco
This gets sent back into the system during sync, my JSON conversion compounds the problem, and invalid characters in various places cause my app to crash.
How do I search for and clean out any of these invalid characters?
Upvotes: 6
Views: 10835
Reputation: 22948
A simpler version of Morten Fast's answer:
NSString *test = @"Olé, señor!";
NSCharacterSet *nonAsciiCharacterSet = [[NSCharacterSet
characterSetWithRange:NSMakeRange(32, 127 - 32)] invertedSet];
test = [[test componentsSeparatedByCharactersInSet:nonAsciiCharacterSet]
componentsJoinedByString:@""];
NSLog(@"%@", test); // Prints @"Ol, seor!"
Notably, this uses NSCharacterSet's +characterSetWithRange: method to specify the desired ASCII range directly, rather than having to build a string first. The results are identical: comparing one set to the other with isEqual: returns YES.
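The equivalence claim can be checked with a short sketch like the one below. The two function names are hypothetical and exist only for this illustration: one builds the set from a string of characters, as in Morten Fast's answer, and the other builds it from a range.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: printable ASCII (32..126) built from an explicit string.
static NSCharacterSet *PrintableASCIIFromString(void) {
    NSMutableString *asciiCharacters = [NSMutableString string];
    for (int i = 32; i < 127; i++) {
        [asciiCharacters appendFormat:@"%c", i];
    }
    return [NSCharacterSet characterSetWithCharactersInString:asciiCharacters];
}

// Hypothetical helper: the same characters, specified as a range.
// NSMakeRange takes (location, length), so this covers 32 through 126.
static NSCharacterSet *PrintableASCIIFromRange(void) {
    return [NSCharacterSet characterSetWithRange:NSMakeRange(32, 127 - 32)];
}

int main(void) {
    @autoreleasepool {
        BOOL same = [PrintableASCIIFromString() isEqual:PrintableASCIIFromRange()];
        NSLog(@"Sets are equal: %@", same ? @"YES" : @"NO");
    }
    return 0;
}
```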
Upvotes: 1
Reputation: 6320
While I strongly believe that supporting Unicode is the right way to go, here's an example of how you can limit a string to only certain characters (in this case, printable ASCII):
NSString *test = @"Olé, señor!";
NSMutableString *asciiCharacters = [NSMutableString string];
for (int i = 32; i < 127; i++) { // int, so the value matches the %c format specifier
[asciiCharacters appendFormat:@"%c", i];
}
NSCharacterSet *nonAsciiCharacterSet = [[NSCharacterSet characterSetWithCharactersInString:asciiCharacters] invertedSet];
test = [[test componentsSeparatedByCharactersInSet:nonAsciiCharacterSet] componentsJoinedByString:@""];
NSLog(@"%@", test); // Prints @"Ol, seor!"
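If you would rather keep legitimate Unicode text and drop only hidden characters, the same split-and-join idea works with a different set. This is a minimal sketch, assuming the unwanted characters are Unicode control or format characters (the categories that +controlCharacterSet covers). The function name StripControlCharacters is hypothetical, and this will not repair text that was already decoded with the wrong encoding.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: removes Unicode control and format characters
// while leaving printable non-ASCII text such as "é" intact.
// Note: tabs and newlines are control characters too, so they are removed as well.
static NSString *StripControlCharacters(NSString *input) {
    NSCharacterSet *controls = [NSCharacterSet controlCharacterSet];
    return [[input componentsSeparatedByCharactersInSet:controls]
            componentsJoinedByString:@""];
}

int main(void) {
    @autoreleasepool {
        // \uFEFF is a byte order mark (a format character) and \a is the BEL control character.
        NSString *dirty = @"Ol\uFEFFé, se\añor!";
        NSLog(@"%@", StripControlCharacters(dirty));
    }
    return 0;
}
```

Whether this is enough depends on which characters actually break the JSON conversion, so it is worth logging the offending code points before settling on a set.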
Upvotes: 22