Reputation: 31
Why does the pattern [A-Z][A-z]* return Ve for the French word Vénus when used with NSRegularExpression? I want to match camel-case words, but this word behaves strangely.
Upvotes: 2
Views: 209
Reputation: 336448
Your regex matches Ve and not Vé because there are two ways to represent é in Unicode: as the single precomposed character U+00E9, or as e followed by the combining acute accent (U+0065 U+0301). Note that this combining mark (U+0301) is not the "standalone" ´ character (U+00B4).
Your string is apparently encoded using the second option. Therefore [A-z] matches only the first half of the combined character: the e. Since the combining mark that follows doesn't match, the regex stops at this point. You should normalize the string before applying a regex to it.
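A minimal Swift sketch of the normalization step, assuming Foundation is available. The helper name firstMatch(of:in:) is hypothetical, and the \p{Lu}\p{L}* pattern is my own assumption rather than part of the original answer: once é is composed into U+00E9 it is a single letter, but it still falls outside the ASCII range, so an ASCII-only class would stop right after the V.

```swift
import Foundation

// Hypothetical helper: returns the first match of `pattern` in `text`, or nil.
func firstMatch(of pattern: String, in text: String) -> String? {
    guard let regex = try? NSRegularExpression(pattern: pattern) else { return nil }
    let nsText = text as NSString
    let fullRange = NSRange(location: 0, length: nsText.length)
    guard let match = regex.firstMatch(in: text, options: [], range: fullRange) else {
        return nil
    }
    // Slice at the UTF-16 level so a match that ends between "e" and its
    // combining accent can still be extracted.
    return nsText.substring(with: match.range)
}

// "Vénus" in decomposed form: "e" followed by U+0301 (combining acute accent).
let decomposed = "Ve\u{0301}nus"

// Canonical composition (NFC) merges the pair into the single code point U+00E9.
let composed = decomposed.precomposedStringWithCanonicalMapping

// On the decomposed string the ASCII class stops before the combining mark.
let before = firstMatch(of: "[A-Z][A-Za-z]*", in: decomposed)

// After composing, the ASCII class still cannot consume U+00E9 ...
let asciiAfter = firstMatch(of: "[A-Z][A-Za-z]*", in: composed)

// ... so this sketch assumes Unicode letter properties are acceptable instead.
let letters = firstMatch(of: "\\p{Lu}\\p{L}*", in: composed)

print(before ?? "nil", asciiAfter ?? "nil", letters ?? "nil")
```

If the input may contain letters that have no precomposed form, normalization alone will not cover them; allowing combining marks in the pattern (for example \p{M}) is one possible alternative.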
Furthermore, use [A-Za-z] instead of [A-z]. Otherwise, some non-letter characters like ^ or ] will also be matched.
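To see which characters slip through, here is a small Swift check, again only a sketch: the helper name matches(_:_:) is hypothetical. It tests the six ASCII characters that sit between Z (U+005A) and a (U+0061) against both character classes.

```swift
import Foundation

// Hypothetical helper: true if `pattern` matches anywhere in `text`.
func matches(_ pattern: String, _ text: String) -> Bool {
    guard let regex = try? NSRegularExpression(pattern: pattern) else { return false }
    let range = NSRange(location: 0, length: text.utf16.count)
    return regex.firstMatch(in: text, options: [], range: range) != nil
}

// The ASCII characters located between "Z" and "a".
let punctuation = ["[", "\\", "]", "^", "_", "`"]

// [A-z] spans the whole range from "A" to "z", including these characters.
let loose = punctuation.filter { matches("^[A-z]$", $0) }

// [A-Za-z] lists the two letter ranges separately and excludes them.
let strict = punctuation.filter { matches("^[A-Za-z]$", $0) }

print(loose)   // all six characters
print(strict)  // empty
```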
Upvotes: 2