Reputation: 18010
The following code:
$string ='۱۲۳۴۵۶۷۸۹۰';
$regex ='@۱@';
preg_match_all($regex,$string,$match);
var_dump($match);
will output:
array(1) {
[0] =>
array(1) {
[0] =>
string(2) "۱"
}
}
but
$regex2 ='@[۱]@';
preg_match_all($regex2,$string,$match);
var_dump($match);
will output
array (size=1)
0 =>
array (size=11)
0 => string '�' (length=1)
1 => string '�' (length=1)
2 => string '�' (length=1)
3 => string '�' (length=1)
4 => string '�' (length=1)
5 => string '�' (length=1)
6 => string '�' (length=1)
7 => string '�' (length=1)
8 => string '�' (length=1)
9 => string '�' (length=1)
10 => string '�' (length=1)
What I actually want is to use a regex like [۱۲۳۴۵۶۷۸۹۰], but the function gives strange results with such patterns. I am using PHP 5.4.
Upvotes: 0
Views: 54
Reputation: 324630
Try adding the Unicode flag:
$regex = '@[۱]@u';
The reason is that ۱ is actually two bytes long in UTF-8 (0xDB 0xB1), and without the u modifier PCRE treats both the pattern and the subject as sequences of bytes. On its own, ۱ is harmless because the pattern only matches those exact two bytes in that order. Inside a character class, however, each byte becomes a separate alternative, so [۱] matches any single 0xDB or 0xB1 byte. All ten digits in your string share the same lead byte, 0xDB, which is why you get eleven one-byte matches: ten lead bytes plus the trailing byte of ۱.
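To see the difference for yourself, here is a minimal sketch that counts the matches with and without the modifier. It assumes the source file is saved as UTF-8; the variable names ($byteCount, $charCount, $digitCount) are only for illustration and are not from the question.

```php
<?php
// Subject string from the question; this file must be saved as UTF-8.
$string = '۱۲۳۴۵۶۷۸۹۰';

// Show the raw bytes of the single character used in the pattern.
echo bin2hex('۱'), "\n";

// Without /u the class [۱] means "byte 0xDB or byte 0xB1",
// so every lead byte in the subject matches on its own.
$byteCount = preg_match_all('@[۱]@', $string, $byteMatches);

// With /u both pattern and subject are decoded as UTF-8 characters,
// so the class contains exactly one character.
$charCount = preg_match_all('@[۱]@u', $string, $charMatches);

// The class the question actually wants: all ten digits.
$digitCount = preg_match_all('@[۱۲۳۴۵۶۷۸۹۰]@u', $string, $digitMatches);

echo $byteCount, "\n";  // 11 one-byte matches
echo $charCount, "\n";  // 1 whole-character match
echo $digitCount, "\n"; // 10 whole-character matches
```

With the u modifier every entry in $digitMatches[0] is a complete two-byte character, so var_dump prints the digits rather than broken bytes.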
Upvotes: 2