Reputation: 97
I'm using Foursquare's API to retrieve attraction names. The problem is that for certain cities (such as Cairo, Moscow, and Beijing) the English name of the attraction is combined with its name in the local language, so an attraction in Cairo looks like this:
Wekalet Al-Ghouri Arts Center | وكالة السلطان الغوري
For each attraction, I query Flickr's API with the name to find a photo. However, the string above returns almost no results, while querying just 'Wekalet Al-Ghouri Arts Center' returns plenty. So my question is: is there a way to identify and remove non-English characters from a string? Thanks in advance for any help :)
Upvotes: 0
Views: 745
Reputation: 2052
My hacky solution:
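NSString *stringWithForeignCharacters = @"Wekalet Al-Ghouri Arts Center | وكالة السلطان الغوري";
// Characters to keep: ASCII letters, digits, hyphen, plus, and space
NSMutableCharacterSet *englishCharacterSet = [NSMutableCharacterSet characterSetWithCharactersInString:@"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789-+ "];
// Add other such character sets as needed
[englishCharacterSet formUnionWithCharacterSet:[NSCharacterSet symbolCharacterSet]];
// Everything outside the allowed set is treated as foreign
NSCharacterSet *foreignCharacters = [englishCharacterSet invertedSet];
// Split on the foreign characters and rejoin the remaining pieces
NSString *filteredString = [[stringWithForeignCharacters componentsSeparatedByCharactersInSet:foreignCharacters] componentsJoinedByString:@""];
// The "|" (kept by symbolCharacterSet) and the spaces between the removed words are left over, so trim them
filteredString = [filteredString stringByTrimmingCharactersInSet:[NSCharacterSet characterSetWithCharactersInString:@" |"]];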
Warning: This might be slow for complex strings.
Upvotes: 2
Reputation: 16660
Assuming that you only want to keep a fixed set of ASCII characters (the set is very easy to change in the code below), you can do this:
NSString *source = …;
NSMutableString *dest = [source mutableCopy];
NSCharacterSet *validCharacters = [NSCharacterSet characterSetWithCharactersInString:@" -+abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"];
NSCharacterSet *invalidCharacters = [validCharacters invertedSet];
NSRange invalidRange;
// Remove invalid characters one at a time until none are left
while ((invalidRange = [dest rangeOfCharacterFromSet:invalidCharacters]).length != 0)
{
    [dest replaceCharactersInRange:invalidRange withString:@""];
}
Typed in Safari.
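To illustrate how the set of valid characters might be changed, here is a minimal sketch that keeps every printable ASCII character instead of a hand-typed list. The helper name KeepPrintableASCII is hypothetical, and trimming the leftover " |" at the edges is an assumption based on the "English | local" format shown in the question.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper (not part of the answer above): keeps printable ASCII only.
static NSString *KeepPrintableASCII(NSString *source)
{
    // U+0020 (space) through U+007E (~) is 0x5F = 95 code points
    NSCharacterSet *validCharacters = [NSCharacterSet characterSetWithRange:NSMakeRange(0x20, 0x5F)];
    NSCharacterSet *invalidCharacters = [validCharacters invertedSet];

    // Split on every invalid character and rejoin what remains
    NSString *filtered = [[source componentsSeparatedByCharactersInSet:invalidCharacters]
                          componentsJoinedByString:@""];

    // Assumption: a leftover separator and spaces at either end are unwanted
    NSCharacterSet *edges = [NSCharacterSet characterSetWithCharactersInString:@" |"];
    return [filtered stringByTrimmingCharactersInSet:edges];
}
```

Called with the string from the question, this should leave only the English name, which can then be passed to the Flickr query.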
Upvotes: 1