Reputation: 2618
Regex pattern /[^[:ascii:]]+/ui
will match one or more non-ASCII characters.
Regex pattern /[\p{L}]+/ui
will match one or more characters in the Unicode 'letter' class.
I can't figure out how to match one or more characters that are in the Unicode 'letter' class AND are not ASCII characters.
Upvotes: 2
Views: 500
Reputation: 626929
You can use
[^\P{L}A-Za-z]+
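It matches any Unicode letter that is not an ASCII letter: the negated class excludes every non-letter (\P{L}) as well as A-Z and a-z, so only non-ASCII letters remain.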
In PHP, use the u flag to make it work correctly with Unicode strings:
$regex = '/[^\P{L}A-Za-z]+/u';
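As a minimal sketch of how that pattern behaves, the snippet below runs it through preg_match_all. The sample string and the variable names $text and $matches are assumptions made for illustration, not part of the original answer.

```php
<?php
// Hypothetical sample input: ASCII letters, digits, Cyrillic, and CJK letters.
$text = 'abc Привет 123 日本語 xyz';

// Double negation: "not a non-letter, and not A-Z or a-z".
$regex = '/[^\P{L}A-Za-z]+/u';

// Collect every run of non-ASCII letters.
preg_match_all($regex, $text, $matches);

print_r($matches[0]); // expected: Привет and 日本語
```

Without the u flag, PCRE would treat the subject as single bytes, so multibyte letters would not be matched as whole characters.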
Upvotes: 1
Reputation: 785256
You can use a negated character class like this:
[^\P{L}[:ascii:]]+
This will match one or more characters that are neither ASCII nor matched by \P{L} (the inverse of \p{L}).
Alternatively, you can use a negative lookahead in a non-capturing group:
(?:(?![[:ascii:]])\p{L})+
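To check that the two forms agree, here is a rough PHP sketch that runs both against the same input. The sample string, the array labels, and the u flag wrapping are assumptions added for illustration.

```php
<?php
// Hypothetical sample input mixing ASCII and non-ASCII letters.
$text = 'abc Привет 123 日本語 xyz';

// Both patterns from above, wrapped in delimiters with the u flag.
$patterns = [
    'negated class'      => '/[^\P{L}[:ascii:]]+/u',
    'negative lookahead' => '/(?:(?![[:ascii:]])\p{L})+/u',
];

$results = [];
foreach ($patterns as $label => $pattern) {
    preg_match_all($pattern, $text, $m);
    $results[$label] = $m[0];
    echo $label, ': ', implode(', ', $m[0]), PHP_EOL;
}
```

On this input both patterns should return the same two runs. The negated class is the more compact of the two, while the lookahead form spells out the "letter AND not ASCII" condition more literally.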
Upvotes: 2