Justin

Reputation: 35

Regex Not Working For Some String

These two strings look the same. Why would the regex match one but not the other?

$str1 = "NЕТ";
$str2 = "NET";
if (preg_match("/NET/", $str1)) {
    echo "Match string 1!";
} else {
    echo "Does not match string 1!";
}
if (preg_match("/NET/", $str2)) {
    echo "Match string 2!";
} else {
    echo "Does not match string 2!";
}

Output:

Does not match string 1!Match string 2!

Upvotes: 1

Views: 80

Answers (1)

nickb

Reputation: 59699

Spoiler alert: $str1 and $str2 are NOT identical.

It's because the characters, while they look the same, are actually different, as a hex dump of each string's bytes shows:

$str1 = "NЕТ"; echo bin2hex($str1), "\n";
$str2 = "NET"; echo bin2hex($str2), "\n";

Outputs:

4ed095d0a2
4e4554
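A quick way to spot this kind of look-alike in code is to check for any character outside the ASCII range. The following is only a sketch: `nonAsciiChars` is a hypothetical helper name, and it assumes the strings are UTF-8 encoded.

```php
<?php
// Hypothetical helper: return every non-ASCII character found in a string.
// Assumes $s is valid UTF-8.
function nonAsciiChars(string $s): array
{
    // The /u modifier makes PCRE treat the pattern and subject as UTF-8,
    // so the negated class matches whole code points above U+007F.
    preg_match_all('/[^\x00-\x7F]/u', $s, $m);
    return $m[0];
}

// "\u{0415}" and "\u{0422}" are the Cyrillic letters from $str1.
foreach (["N\u{0415}\u{0422}", "NET"] as $s) {
    printf("%s: %d non-ASCII character(s)\n", bin2hex($s), count(nonAsciiChars($s)));
}
```

Any string that is supposed to be plain Latin text but reports a non-zero count deserves a closer look.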

Printing each character's decimal Unicode code point and name gives the first block below for $str1 and the second for $str2. Only the N in $str1 is Latin; the other two letters are Cyrillic look-alikes.

78 LATIN CAPITAL LETTER N
1045 CYRILLIC CAPITAL LETTER IE
1058 CYRILLIC CAPITAL LETTER TE

78 LATIN CAPITAL LETTER N
69 LATIN CAPITAL LETTER E
84 LATIN CAPITAL LETTER T
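A listing like the one above could be produced along these lines. This is a minimal sketch rather than the exact code used for the answer: `describeChars` is a hypothetical helper, and it assumes UTF-8 input, the mbstring extension (PHP 7.4+ for `mb_str_split`), and the intl extension for `IntlChar::charName()`.

```php
<?php
// Hypothetical helper: one "codepoint NAME" line per character.
// Assumes UTF-8 input, ext-mbstring (PHP 7.4+) and ext-intl.
function describeChars(string $s): array
{
    $lines = [];
    foreach (mb_str_split($s, 1, 'UTF-8') as $ch) {
        $cp = mb_ord($ch, 'UTF-8');                      // decimal code point
        $lines[] = $cp . ' ' . IntlChar::charName($cp);  // Unicode character name
    }
    return $lines;
}

// "\u{0415}" and "\u{0422}" are the Cyrillic letters from $str1.
echo implode("\n", describeChars("N\u{0415}\u{0422}")), "\n\n";
echo implode("\n", describeChars("NET")), "\n";
```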

Upvotes: 3
