Desmond Hume

Reputation: 8617

Is there a way to check if a string contains a Unicode letter?

In Cocoa, regular expressions presumably follow the ICU rules for character matching, and ICU supports Unicode character properties such as \p{L}, which matches any kind of Unicode letter. However,

NSString* str = @"A";
NSPredicate* pred = [NSPredicate predicateWithFormat:@"SELF MATCHES '\\p{L}'"];
NSLog(@"%d", [pred evaluateWithObject:str]);

doesn't seem to work; the pattern fails to compile at runtime:

Can't do regex matching, reason: Can't open pattern U_REGEX_BAD_INTERVAL (string A, pattern p{L}, case 0, canon 0)

If character properties are not supported (are they?), how else could I check if a string contains a Unicode letter in my iOS app?

Upvotes: 2

Views: 951

Answers (1)

Wiktor Stribiżew

Reputation: 627101

The main point is that MATCHES requires the pattern to match the entire string, and that the backslash in \p{L} must reach the regex engine as a literal backslash.

The regex can thus be

(?s).*\p{L}.*

Which means:

  • (?s) - enable DOTALL mode
  • .* - match zero or more of any character (including newlines, thanks to DOTALL)
  • \p{L} - match a Unicode letter
  • .* - match zero or more of any character.

In an Objective-C string literal, double the backslash:

NSPredicate * predicat = [NSPredicate predicateWithFormat:@"SELF MATCHES '(?s).*\\p{L}.*'"];

See IDEONE demo

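If the NSPredicate format parser treats backslashes specially (the error message in the question shows the pattern arriving as p{L}, with the backslash already stripped), use four backslashes: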

NSPredicate * predicat = [NSPredicate predicateWithFormat:@"SELF MATCHES '(?s).*\\\\p{L}.*'"];
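To sidestep the escaping question altogether, here is a rough sketch of three ways to run the check. The helper names are made up for illustration. The first passes the pattern as a %@ argument, so only the Objective-C compiler unescapes it. The second uses a regex search, which does not need the surrounding .* at all. The third avoids regex, under the assumption that treating combining marks as letters is acceptable, since letterCharacterSet covers the Unicode L* and M* categories and is therefore slightly broader than \p{L}.

```objectivec
#import <Foundation/Foundation.h>

// Hypothetical helper: full-string MATCHES, with the pattern passed as an
// argument instead of being embedded in the predicate format string.
static BOOL ContainsLetterViaPredicate(NSString *s) {
    // Only the compiler unescapes this literal, so the regex engine
    // receives (?s).*\p{L}.*
    NSString *pattern = @"(?s).*\\p{L}.*";
    NSPredicate *pred =
        [NSPredicate predicateWithFormat:@"SELF MATCHES %@", pattern];
    return [pred evaluateWithObject:s];
}

// Hypothetical helper: regex *search*, which succeeds on a partial match,
// so the bare \p{L} pattern is enough.
static BOOL ContainsLetterViaSearch(NSString *s) {
    NSRange r = [s rangeOfString:@"\\p{L}"
                         options:NSRegularExpressionSearch];
    return r.location != NSNotFound;
}

// Hypothetical helper: no regex. Note that letterCharacterSet also
// includes combining marks (category M*), unlike \p{L}.
static BOOL ContainsLetterViaCharacterSet(NSString *s) {
    NSRange r =
        [s rangeOfCharacterFromSet:[NSCharacterSet letterCharacterSet]];
    return r.location != NSNotFound;
}
```

Any of the three can be called as, for example, ContainsLetterViaSearch(@"12é"). The search and character-set variants also stop at the first letter found rather than matching the whole string.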

Upvotes: 2
