RodMatveev

Reputation: 97

Remove all non-English characters from NSString

I'm using Foursquare's API to retrieve some attraction names. The problem is that for certain cities (like Cairo, Moscow, Beijing) the returned name combines the English name of the attraction with its name in the local language, so an attraction in Cairo, for example, looks like this:

Wekalet Al-Ghouri Arts Center | وكالة السلطان الغوري

For each attraction I use Flickr's API to find a photo, with the name as the search query. However, there are almost no results for the string above, while querying just 'Wekalet Al-Ghouri Arts Centre' gives a lot of results. So my question is: is there a way to identify and remove non-English characters from a string? Thanks for any help in advance :)

Upvotes: 0

Views: 745

Answers (2)

lead_the_zeppelin

Reputation: 2052

My hacky solution:

NSString *stringWithForeignCharacters = @"Wekalet Al-Ghouri Arts Center | وكالة السلطان الغوري";
NSMutableCharacterSet *englishCharacterSet = [NSMutableCharacterSet characterSetWithCharactersInString:@"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789-+ "];
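// Also keep symbols; note that symbolCharacterSet covers Unicode symbols in general, not only ASCII ones.
// Add other such character sets as needed.
[englishCharacterSet formUnionWithCharacterSet:[NSCharacterSet symbolCharacterSet]];
// Everything outside the allowed set counts as foreign.
NSCharacterSet *foreignCharacters = [englishCharacterSet invertedSet];
// Split on the foreign characters and rejoin the remaining pieces.
NSString *filteredString = [[stringWithForeignCharacters componentsSeparatedByCharactersInSet:foreignCharacters] componentsJoinedByString:@""];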

Warning: This might be slow for complex strings.
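The split-and-join approach above leaves behind the spaces that surrounded the removed text, which may matter when the result is used as a search query. A minimal sketch of one way to tidy that up follows; the helper name `FilteredName` and the exact allowed set (letters, digits, hyphen, space, without the symbol set) are assumptions for illustration, not part of the answer.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (name and allowed set are assumptions): keeps only the
// characters listed in `allowed`, then trims whitespace left at either end.
static NSString *FilteredName(NSString *input) {
    NSCharacterSet *allowed = [NSCharacterSet characterSetWithCharactersInString:
        @"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789- "];
    NSCharacterSet *disallowed = [allowed invertedSet];

    // Split on every disallowed character and rejoin the remaining pieces.
    NSString *filtered = [[input componentsSeparatedByCharactersInSet:disallowed]
                             componentsJoinedByString:@""];

    // Drop the leading and trailing spaces that surrounded the removed text.
    return [filtered stringByTrimmingCharactersInSet:
               [NSCharacterSet whitespaceCharacterSet]];
}
```

Called as `FilteredName(@"Wekalet Al-Ghouri Arts Center | وكالة السلطان الغوري")`, this should give `Wekalet Al-Ghouri Arts Center`, since the pipe is not in the allowed set and the trailing spaces are trimmed.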

Upvotes: 2

Amin Negm-Awad

Reputation: 16660

Assuming that you want to keep only a subset of the ASCII character set (letters, space, hyphen, and plus in the code below; changing the set is very easy), you can do this:

NSString *source = …;
NSMutableString *dest = [source mutableCopy];

NSCharacterSet *validCharacters = [NSCharacterSet characterSetWithCharactersInString:@" -+abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"];
NSCharacterSet *invalidCharacters = [validCharacters invertedSet];

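NSRange invalidRange;
// Remove invalid characters one at a time until none are left.
// Note: the NSString method is rangeOfCharacterFromSet: (singular).
while ( (invalidRange = [dest rangeOfCharacterFromSet:invalidCharacters]).length != 0)
{
   [dest replaceCharactersInRange:invalidRange withString:@""];
}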

Typed in Safari.
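To illustrate how the valid set can be changed, here is a hedged sketch that allows the whole printable ASCII range instead of a hand-typed list. The function name `StripNonPrintableASCII` is hypothetical, and the choice of U+0020 through U+007E as "valid" is an assumption about what the asker wants to keep.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the answer): removes every character that is
// not printable ASCII, using the same find-and-delete loop as above.
static NSString *StripNonPrintableASCII(NSString *source) {
    // Printable ASCII runs from U+0020 (space) to U+007E (~): 95 code points.
    NSCharacterSet *valid = [NSCharacterSet characterSetWithRange:NSMakeRange(0x20, 95)];
    NSCharacterSet *invalid = [valid invertedSet];

    NSMutableString *dest = [source mutableCopy];
    NSRange range;
    // rangeOfCharacterFromSet: reports NSNotFound once nothing invalid remains.
    while ((range = [dest rangeOfCharacterFromSet:invalid]).location != NSNotFound) {
        [dest deleteCharactersInRange:range];
    }
    return dest;
}
```

With this set, punctuation such as `|` is kept and the spaces around removed text remain, so the result may still need trimming before it is used as a query.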

Upvotes: 1
