Reputation: 2295
I want to validate a string where each character must be an Arabic or English letter, one of the symbols \-.ـ, or a space.
The first regex I came up with was:
/^([\u0600-\u06ff\u0750-\u077f\ufb50-\ufc3f\ufe70-\ufefca-zA-Z\- .ـ]+)$/
This worked fine in JS but not with PCRE (PHP) validation, so I tried another pattern that avoids \u:
/^[\p{Arabic}a-zA-Z\- .ـ]+$/
This regex gave me no error and worked exactly as I needed when I tested it, but PHP gave a different result for the same text:
if ( preg_match('/^[\p{Arabic}a-zA-Z\- .ـ]+$/', "engعربlisي هنا.hـ") )
die("T");
else
die("F");
The result of the code was F, not T. Why is that?
Upvotes: 1
Views: 2028
Reputation: 627607
The Unicode property class \p{Arabic} by itself is not enough for a PHP regex to match Unicode strings. You need the /u modifier to make PHP treat the pattern and the subject as UTF-8.
u (PCRE_UTF8)
This modifier turns on additional functionality of PCRE that is incompatible with Perl. Pattern and subject strings are treated as UTF-8. This modifier is available from PHP 4.1.0 or greater on Unix and from PHP 4.2.3 on win32. UTF-8 validity of the pattern and the subject is checked since PHP 4.3.5. An invalid subject will cause the preg_* function to match nothing; an invalid pattern will trigger an error of level E_WARNING. Five and six octet UTF-8 sequences are regarded as invalid since PHP 5.3.4 (resp. PCRE 7.3 2007-08-28); formerly those have been regarded as valid UTF-8.
Thus:
if ( preg_match('/^[\p{Arabic}a-zA-Z\- .ـ]+$/u', "engعربlisي هنا.hـ") )
// ^^
die("T");
else
die("F");
Outputs T.
See IDEONE demo
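To illustrate why the modifier matters, here is a minimal sketch, assuming PHP 7 or later and a source file saved as UTF-8. The helper name isValidName is hypothetical and not part of the original question. Without /u the engine walks the subject byte by byte, so a two-byte Arabic letter is not a single character to it. The sketch also shows the "invalid subject matches nothing" behaviour quoted above, using preg_last_error() to find the cause.

```php
<?php
// Hypothetical helper (name chosen for illustration): accepts Arabic or
// English letters, spaces, and the symbols - . ـ
function isValidName(string $s): bool
{
    // /u makes PCRE treat both the pattern and the subject as UTF-8.
    return preg_match('/^[\p{Arabic}a-zA-Z\- .ـ]+$/u', $s) === 1;
}

// Byte mode vs. UTF-8 mode: "ع" is two bytes in UTF-8.
var_dump(preg_match('/^.$/', 'ع'));   // int(0): "." consumed only one byte
var_dump(preg_match('/^.$/u', 'ع'));  // int(1): "." consumed one code point

var_dump(isValidName("engعربlisي هنا.hـ")); // bool(true)
var_dump(isValidName("abc123"));            // bool(false): digits not allowed

// An invalid UTF-8 subject never matches under /u; preg_match() returns
// false and preg_last_error() reports why.
$bad = "abc\xFF";
var_dump(isValidName($bad));                         // bool(false)
var_dump(preg_last_error() === PREG_BAD_UTF8_ERROR); // bool(true)
```

Checking preg_last_error() after a failed match can help separate "the input has disallowed characters" from "the input is not valid UTF-8", which otherwise both look like a plain failure.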
Upvotes: 1