yeah its me

Reputation: 1141

preg_match does not distinguish Latin and Cyrillic characters

preg_match('/^[\p{L}\s]+$/u', 'АБВГД ENGLISH STRING', $matches);

This matches every character, Cyrillic and Latin alike. Why are they not filtered? The file encoding is UTF-8. What am I doing wrong?

Upvotes: 0

Views: 2612

Answers (2)

kailash19

Reputation: 1821

Use:

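/^(?:\p{Cyrillic}+|\p{Latin}+)$/u groups the two scripts as alternatives, so the whole string may contain only one type of character. The u modifier is needed so that the pattern and subject are treated as UTF-8. Note that this pattern does not allow whitespace; add \s to each alternative if the string may contain spaces.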

\p{Cyrillic} matches any Cyrillic character.

\p{Latin} matches any Latin character.
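
As a rough sketch of the grouping idea, the check could be wrapped in a small function. The name isSingleScript is hypothetical, and allowing whitespace inside each alternative is an assumption added here so that strings with spaces can pass:

```php
<?php
// Hypothetical helper (not part of the original answer): returns true when
// $text uses only one of the two scripts. Whitespace is allowed in each
// alternative, which is an assumption added for this sketch.
function isSingleScript(string $text): bool
{
    // The u modifier makes PCRE read both pattern and subject as UTF-8.
    $pattern = '/^(?:[\p{Cyrillic}\s]+|[\p{Latin}\s]+)$/u';

    return preg_match($pattern, $text) === 1;
}

var_dump(isSingleScript('АБВГД ENGLISH STRING')); // bool(false): mixed scripts
var_dump(isSingleScript('ENGLISH STRING'));       // bool(true): Latin only
var_dump(isSingleScript('АБВГД'));                // bool(true): Cyrillic only
```

Because each alternative is anchored between ^ and $, a string that mixes the two scripts cannot satisfy either branch.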

If you need only the English characters from the whole string, use:

preg_match_all('/[\p{Latin}]+/u', 'АБВГД ENGLISH STRING', $matches);
print_r($matches);

It returns every run of Latin letters in the string, here ENGLISH and STRING.

Upvotes: 5

Rowman Pirce

Reputation: 423

\p{L} in a regex matches a Unicode general category (L is the Letter category). That's why your regex matches all letters, including Cyrillic ones.

If you want only Latin letters, use \p{Latin} for all Unicode Latin-script characters, or a-z to match just the ASCII letters (add A-Z or the i modifier to include uppercase).
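
To make the difference concrete, here is a minimal sketch that runs the three patterns against a few sample strings. The function name classifyLetters and the sample strings are made up for illustration:

```php
<?php
// Hypothetical helper: reports which of the three patterns match the
// entire string. Names and samples are illustrative only.
function classifyLetters(string $text): array
{
    $patterns = [
        'any_letter'   => '/^\p{L}+$/u',      // Unicode category Letter, any script
        'latin_script' => '/^\p{Latin}+$/u',  // Latin script, including accented letters
        'ascii_only'   => '/^[a-zA-Z]+$/',    // plain ASCII letters
    ];

    $result = [];
    foreach ($patterns as $name => $pattern) {
        $result[$name] = preg_match($pattern, $text) === 1;
    }

    return $result;
}

print_r(classifyLetters('АБВГД'));        // Cyrillic sample
print_r(classifyLetters('ENGLISH'));      // ASCII sample
print_r(classifyLetters("caf\u{00E9}"));  // Latin with an accented letter (é)
```

The Cyrillic sample passes only the \p{L} check, the ASCII sample passes all three, and the accented sample passes \p{L} and \p{Latin} but not the ASCII range.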

Upvotes: 1
