Reputation: 363
Good day!
I am having some trouble with preg_replace and UTF-8 characters. The following code fragment:
$v = "line1\nline2\r\nмы хотели бы поблагодарить";
print $v;
print preg_replace("#\R#", "", $v);
print preg_replace("#\n#", "", $v);
returns the following output:
line1
line2
мы хотели бы поблагодарить
line1line2мы �отели бы поблагодарить
line1line2
мы хотели бы поблагодарить Вас
For some reason the х is unreadable when \R is used, but it is unaffected when \n is used. As \R is PHP-specific, I suppose this causes the problem. Does anybody have a clue how I could use \R (which is not accepted by str_replace) in preg_replace? I fear this problem might occur in many other cases, not only with the letter х.
Upvotes: 3
Views: 434
Reputation: 626845
Since your input is Unicode (UTF-8), you must pass the /u flag to the regex so that the input is handled correctly:
$v = "line1\nline2\r\nмы хотели бы поблагодарить";
echo preg_replace('/\R/u', "", $v);
// => line1line2мы хотели бы поблагодарить
See IDEONE demo
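The /u flag is required whenever the pattern or the input can contain Unicode text: it makes PCRE treat both as UTF-8 rather than as a sequence of single bytes. Without it, \R also matches the lone byte 0x85 (NEL, "next line"), which happens to be the second byte of the UTF-8 encoding of х (0xD1 0x85), so that byte is removed and the character is corrupted.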
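To make the effect of the flag concrete, here is a minimal sketch that compares the two calls at the byte level. It assumes the source file is saved as UTF-8 and that PHP's PCRE uses its default \R behaviour (any Unicode newline sequence); the variable names and the str_replace alternative at the end are illustrative additions, not part of the original answer.

```php
<?php
$v = "line1\nline2\r\nмы хотели бы поблагодарить";

// "х" (U+0445) is stored in UTF-8 as the two bytes D1 85.
$khaHex = bin2hex("х");

// Without the u modifier PCRE works byte by byte, and \R may match the
// single byte 0x85, which strips the second byte of "х".
$broken = preg_replace('/\R/', '', $v);

// With the u modifier the subject is read as UTF-8 code points, so \R can
// only match whole newline characters and "х" is left intact.
$fixed = preg_replace('/\R/u', '', $v);

// Assumed alternative: if only CR/LF line endings matter, a plain
// byte-safe str_replace with an explicit list avoids regex altogether.
$plain = str_replace(["\r\n", "\n", "\r"], '', $v);

echo $khaHex, PHP_EOL;
// preg_match('//u', ...) returns 1 only when the subject is valid UTF-8.
echo 'without /u valid UTF-8: ', var_export(preg_match('//u', $broken) === 1, true), PHP_EOL;
echo 'with /u valid UTF-8:    ', var_export(preg_match('//u', $fixed) === 1, true), PHP_EOL;
echo $fixed, PHP_EOL;
```

The str_replace variant handles only \r\n, \n and \r; unlike \R with /u it does not remove rarer separators such as U+2028 or U+2029, so it fits only when those cannot occur in the input.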
Upvotes: 5