Karolis

Reputation: 2618

Combine regex character classes

The regex pattern /[^[:ascii:]]+/ui matches one or more non-ASCII characters.

The regex pattern /[\p{L}]+/ui matches one or more characters in the Unicode 'letter' class.

I can't figure out how to match one or more characters that are in the Unicode 'letter' class AND are not ASCII characters.

Upvotes: 2

Views: 500

Answers (2)

Wiktor Stribiżew

Reputation: 626929

You can use

[^\P{L}A-Za-z]+

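It matches one or more Unicode letters that are not ASCII letters. The negated class excludes every non-letter (\P{L}) and the ASCII letters A-Za-z, so only non-ASCII letters remain.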

See the regex demo.

In PHP, you should use the u flag to make it work correctly with Unicode strings:

$regex = '/[^\P{L}A-Za-z]+/u';
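
As a minimal sketch of how that pattern behaves in practice, the snippet below runs it through preg_match_all. The sample string and the variable names $text and $matches are assumptions made for illustration, and the file is assumed to be saved as UTF-8.

```php
<?php
// Pattern from the answer: any letter that is not an ASCII letter, one or more times.
$regex = '/[^\P{L}A-Za-z]+/u';

// Assumed sample input mixing ASCII letters, digits, Cyrillic and Japanese.
$text = 'Привет world, こんにちは 42, abc';

// $matches[0] receives every full match, in order of appearance.
preg_match_all($regex, $text, $matches);

print_r($matches[0]); // the two non-ASCII runs: "Привет" and "こんにちは"
```

The ASCII words, digits, punctuation and spaces are skipped because each of them is either an ASCII letter or a non-letter.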

Upvotes: 1

anubhava

Reputation: 785256

You can use a negated character class like this:

[^\P{L}[:ascii:]]+

RegEx Demo 1

This will match one or more characters that are neither ASCII nor matched by \P{L} (the inverse of \p{L}), i.e. non-ASCII letters only.
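
A rough PHP sketch of using this class to test whether a string contains any non-ASCII letter. The helper name hasNonAsciiLetter and the sample strings are hypothetical, not part of the answer, and the u flag is assumed so that the subject is treated as UTF-8.

```php
<?php
// Hypothetical helper: true if $s contains at least one letter outside ASCII.
function hasNonAsciiLetter(string $s): bool
{
    // preg_match returns 1 on a match, 0 on no match, false on error
    // (for example, invalid UTF-8 in the subject when /u is set).
    return preg_match('/[^\P{L}[:ascii:]]/u', $s) === 1;
}

var_dump(hasNonAsciiLetter('hello world 123')); // bool(false): ASCII only
var_dump(hasNonAsciiLetter('hello мир'));       // bool(true): Cyrillic letters
var_dump(hasNonAsciiLetter('price: 5 €'));      // bool(false): € is non-ASCII but not a letter
```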


Alternatively, you can use a negative lookahead in a non-capturing group:

(?:(?![[:ascii:]])\p{L})+

RegEx Demo 2
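
To sanity-check that the lookahead form and the negated class agree, something like the following could be run. It is only a sketch: the sample text and the names $patterns and $results are assumptions, and agreement on one input does not prove equivalence in general.

```php
<?php
// Assumed sample input, saved as UTF-8.
$text = 'abc Привет xyz 日本語 42';

$patterns = [
    'negated class' => '/[^\P{L}[:ascii:]]+/u',
    'lookahead'     => '/(?:(?![[:ascii:]])\p{L})+/u',
];

$results = [];
foreach ($patterns as $name => $pattern) {
    // Collect all full matches for each pattern.
    preg_match_all($pattern, $text, $m);
    $results[$name] = $m[0];
}

// Both patterns should report the same runs of non-ASCII letters.
var_dump($results['negated class'] === $results['lookahead']); // bool(true)
print_r($results['lookahead']); // "Привет" and "日本語"
```

The lookahead version checks each position twice (once for "not ASCII", once for "is a letter"), so the single negated class is usually the simpler choice when both are available.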

Upvotes: 2
