Reputation: 123
I have a regex for detecting Cyrillic first, middle, and last names:
([А-Я][а-я]+\s+[А-Я][а-я]+[.|\s|][А-Я][а-я]+[.|\s|])
Using:
preg_match_all('/([А-Я][а-я]+(\\s|.|[ ])[А-Я][а-я]+(\\s|.|[ ])[А-Я][а-я]+)/','it\'s a test string with a name like Васильців Василь Васильович and Петро Петрович Петренко смисми ВВ Аммм Мммм Аааааа',$ar);
The results:
Array
(
[0] => Array
(
[0] => �асил�
[1] => �асил�
[2] => �асильови�
[3] => енко
[4] => мисми
[5] => �ааааа
)
[1] => Array
(
[0] => �асил�
[1] => �асил�
[2] => �асильови�
[3] => енко
[4] => мисми
[5] => �ааааа
)
[2] => Array
(
[0] => �
[1] => �
[2] => �
[3] => �
[4] => �
[5] => �
)
[3] => Array
(
[0] => �
[1] => �
[2] => �
[3] => �
[4] => �
[5] => �
)
)
It works fine at https://regex101.com/r/xA6vX0/1 but not in PHP, where it matches the wrong parts of the text. Can you explain what's wrong, or point me to a better online service?
Upvotes: 2
Views: 158
Reputation: 627100
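Without the /u modifier, PCRE treats both the pattern and the subject as byte strings, so each two-byte UTF-8 Cyrillic letter is matched byte by byte. That is where the broken characters in your output come from. I have just tested on PHP 5.5.18, and with the /u modifier it works well: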
preg_match_all('/([А-ЯЁ][ёа-я]+(?:[\\s.][ЁА-Я][ёа-я]+){2})/u','it\'s a test string with a name like Васильців Василь Васильович and Петро Петрович Петренко смисми ВВ Аммм Мммм Аааааа',$ar);
print_r($ar);
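I also merged the separator alternation into a single character class, [\s.]. In your preg_match_all call the period was unescaped inside (\s|.|[ ]), so it matched any character. The repeated name part is shortened with a {2} quantifier.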
Output:
Array
(
[0] => Array
(
[0] => Петро Петрович Петренко
[1] => Аммм Мммм Аааааа
)
[1] => Array
(
[0] => Петро Петрович Петренко
[1] => Аммм Мммм Аааааа
)
)
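The effect of the u option can be checked with a minimal sketch like the one below. It assumes the source file is saved as UTF-8. The variable names ($text, $pattern, $uk) are illustrative only. The last pattern is my own hypothetical extension, not part of the tested answer: it adds the Ukrainian letters І, Ї, Є, Ґ to the classes, because "Васильців" contains "і", which lies outside the а-я range and is likely why that name is missing from the output above.

```php
<?php
// Assumes this file is saved as UTF-8.
$text = 'a name like Васильців Василь Васильович and Петро Петрович Петренко';

// Without /u the dot matches single bytes; with /u it matches code points.
$bytesWithoutU = preg_match_all('/./s', 'Я');   // 2: "Я" is two bytes in UTF-8
$charsWithU    = preg_match_all('/./su', 'Я');  // 1: one code point
echo $bytesWithoutU, ' ', $charsWithU, "\n";    // prints "2 1"

// Pattern from the answer (outer capture group dropped for brevity).
// With /u, [А-Я] is a range of code points, so whole letters are matched.
$pattern = '/[А-ЯЁ][ёа-я]+(?:[\s.][ЁА-Я][ёа-я]+){2}/u';
preg_match_all($pattern, $text, $m);
print_r($m[0]);  // only "Петро Петрович Петренко"

// Hypothetical extension (an assumption, not from the answer):
// add the Ukrainian letters so that names containing "і" also match.
$uk = '/[А-ЯЁІЇЄҐ][а-яёіїєґ]+(?:[\s.][А-ЯЁІЇЄҐ][а-яёіїєґ]+){2}/u';
preg_match_all($uk, $text, $m2);
print_r($m2[0]); // "Васильців Василь Васильович" and "Петро Петрович Петренко"
```

If preg_match_all returns false with the u option, the subject is probably not valid UTF-8, and preg_last_error() should report PREG_BAD_UTF8_ERROR.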
Upvotes: 1