Reputation: 127
I'm trying to add a special string '|||' after newlines, whitespace and a few other characters. I'm doing this because I want to split my text into an array. I was thinking of doing it like this:
$result = preg_replace("/<br>/", "<br>|||", preg_replace("/\s/", " |||", preg_replace("/\r/", "\r|||", preg_replace("/\n/", "\n|||", preg_replace("/’/", "’|||", preg_replace("/'/", "'|||", $text))))));
$result = preg_split("/[|||]+/", $result);
It works for every word except those containing the character à, which gets replaced by �. I'm sure the problem is in this code, because my original string $text still shows à correctly.
Upvotes: 1
Views: 164
Reputation: 627607
Since your pattern deals with a Unicode string, pass the /u modifier so that both the pattern and the subject are treated as UTF-8 rather than as single bytes.
Also, you do not need so many chained regex replacements: group the patterns into a single capturing group and use a backreference to it in the replacement.
Use
preg_replace("/(<br>|[\s’'])/u", "$1|||", $text)
Note that \s matches spaces, carriage returns and newlines.
Details:
(<br>|[\s’']) - Group 1, capturing either:
<br> - the literal <br> character sequence
| - or
[\s’'] - a whitespace character, ’ or '
See the PHP demo:
$text = "Voilà. C'est vrai.";
echo preg_replace("/(<br>|[\s’'])/u", "$1|||", $text);
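To get from the marked string to the array the question asks for, one option is to split on the literal marker with explode() instead of a regex. The pattern "/[|||]+/" in the question is a character class, so it matches any run of | characters rather than the exact string |||. The following is only a sketch: the helper name splitAfterDelimiters is hypothetical, and it assumes the input is valid UTF-8 and does not already contain |||.

```php
<?php
// Hypothetical helper, not part of the original answer.
function splitAfterDelimiters(string $text): array
{
    // Append the marker after <br>, any whitespace, ’ or '.
    // The /u modifier makes the pattern and subject UTF-8 aware.
    $marked = preg_replace("/(<br>|[\s’'])/u", "$1|||", $text);

    // With /u, preg_replace() returns null if the subject is not valid UTF-8.
    if ($marked === null) {
        throw new InvalidArgumentException('Input is not valid UTF-8.');
    }

    // Split on the literal marker; explode() does no regex interpretation.
    return explode('|||', $marked);
}

print_r(splitAfterDelimiters("Voilà. C'est vrai."));
// Elements: "Voilà. ", "C'", "est ", "vrai."
```

If the text ends with one of the delimiters, the last element is an empty string; array_filter() or a check on the final element can drop it if that is unwanted.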
Upvotes: 1