Reputation: 1141
preg_match('/^[\p{L}\s]+$/u', 'АБВГД ENGLISH STRING', $matches);
This matches all the characters, both Cyrillic and Latin. Why are they not filtered out? The file encoding is UTF-8. What am I doing wrong?
Upvotes: 0
Views: 2612
Reputation: 1821
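Use:
/^(?:\p{Cyrillic}+|\p{Latin}+)$/u
The alternation inside the group allows only one script for the whole string: either all Cyrillic or all Latin. The u modifier is needed so that the pattern and the subject are treated as UTF-8. As written, the pattern also rejects whitespace; put \s in a character class with each property if spaces should be allowed.
\p{Cyrillic} matches any Cyrillic character.
\p{Latin} matches any Latin character.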
If you need only the English characters from the whole string, use:
preg_match_all('/[\p{Latin}]+/u', 'АБВГД ENGLISH STRING', $matches);
print_r($matches);
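It returns every run of Latin letters found in the string (here, ENGLISH and STRING). Note that \p{Latin} covers the whole Latin script, including accented letters, not only the English alphabet.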
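As a rough sketch of the single-script check described above, assuming whitespace between words should be allowed (the helper name isSingleScript is hypothetical and not part of PHP):

```php
<?php
// Hypothetical helper: returns true when the string is made of letters from
// exactly one script (Cyrillic or Latin), optionally separated by whitespace.
function isSingleScript(string $text): bool
{
    // Each branch accepts one script plus whitespace; the u modifier makes
    // PCRE treat both the pattern and the subject as UTF-8.
    $pattern = '/^(?:[\p{Cyrillic}\s]+|[\p{Latin}\s]+)$/u';
    return preg_match($pattern, $text) === 1;
}

var_dump(isSingleScript('АБВГД'));                // Cyrillic only
var_dump(isSingleScript('ENGLISH STRING'));       // Latin only
var_dump(isSingleScript('АБВГД ENGLISH STRING')); // mixed scripts
```

One caveat of this sketch: a string made only of whitespace also passes, so add a separate check if that matters.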
Upvotes: 5
Reputation: 423
\p{L} in a regex matches the Unicode general category L (Letter), which covers letters of every script.
That's why your regex matches all letters, including Cyrillic ones.
If you want only Latin letters, use \p{Latin} for all Unicode Latin letters, or [a-zA-Z] to match only ASCII letters.
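To illustrate the difference, here is a minimal comparison of the three character classes on the string from the question. The variable names and the extra test word are only illustrative:

```php
<?php
$subject = 'АБВГД ENGLISH STRING';

// \p{L} accepts letters of any script, so Cyrillic and Latin both pass.
$anyLetter = preg_match('/^[\p{L}\s]+$/u', $subject);

// \p{Latin} accepts only Latin-script letters, so the Cyrillic part fails.
$latinOnly = preg_match('/^[\p{Latin}\s]+$/u', $subject);

// \p{Latin} still accepts accented Latin letters such as U+00EB ...
$latinAccent = preg_match('/^[\p{Latin}\s]+$/u', "Zo\u{00EB}");

// ... while the ASCII class rejects them.
$asciiAccent = preg_match('/^[a-zA-Z\s]+$/', "Zo\u{00EB}");

var_dump($anyLetter, $latinOnly, $latinAccent, $asciiAccent);
```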
Upvotes: 1