Reputation: 115
I want to remove runs of the same repeated non-word character.
My code looks like this:
<?php
$name = 'Malines - Blockbuster (prod. Malines) ***** (((((( &%^$';
echo preg_replace('/[^\pL\pN\s]{2,}/u', '', $name);
?>
The input is Malines - Blockbuster (prod. Malines) ***** (((((( &%^$
and the result should be Malines - Blockbuster (prod. Malines) &%^$
Only repeated non-word characters should be removed. Could you help me?
Upvotes: 0
Views: 101
Reputation: 89567
Using the same idea as Arlaud Agbe Pierre (that is the way to do it), the pattern can be shortened with the Xan character class. \p{Xan} is the union of \p{L} and \p{N}. \P{Xan} (with an uppercase "P") is the negation of this class (i.e. everything that is not a letter or a number).
$str = 'Malines - Blockbuster (prod. Malines) ààà ***** (((((( &%^$';
echo preg_replace('~(\P{Xan})\1+~u', '$1', $str);
Note: whitespace is not part of Xan, so with this pattern runs of the same whitespace character are collapsed to a single one too.
Another way:
echo preg_replace('~(\P{Xan})\K\1+~u', '', $str);
where \K resets the start of the reported match, so everything matched before it is left out of the match result. (Note that capturing groups defined before \K can still be referenced after it.)
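To make the effect of \K concrete, here is a minimal sketch. The sample strings are made up for illustration and are not from the question. It checks what preg_match reports as the whole match, then confirms that both replacement styles give the same result.

```php
<?php
// Hypothetical sample strings, chosen only for illustration.
$probe = 'ab***cd';
$str   = 'a--b  c!!!';

// With \K, the first '*' is consumed but dropped from the reported match:
// $m[0] holds only the repeats, while group 1 still holds the first character.
preg_match('~(\P{Xan})\K\1+~u', $probe, $m);
echo $m[0], ' / ', $m[1], PHP_EOL; // ** / *

// Variant 1: match the whole run and put the first character back.
$viaGroup = preg_replace('~(\P{Xan})\1+~u', '$1', $str);

// Variant 2: keep the first character out of the match with \K,
// so replacing with an empty string deletes only the repeats.
$viaK = preg_replace('~(\P{Xan})\K\1+~u', '', $str);

echo $viaGroup, PHP_EOL; // a-b c!
echo $viaK, PHP_EOL;     // a-b c!
```

Both variants leave one character of each run in place. They differ only in whether the replacement string or the pattern is responsible for keeping it.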
Upvotes: 1
Reputation: 4133
You must use a backreference to require that the following characters are the same as the first one:
/([^\pL\pN\s])\1+/u
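As a rough sketch, this is how the pattern could be applied to the string from the question. The second whitespace-collapsing step is an assumption on my part: it is not part of the pattern above, but removing the runs leaves their surrounding spaces behind, and the expected output in the question shows single spaces.

```php
<?php
// The sample input from the question.
$name = 'Malines - Blockbuster (prod. Malines) ***** (((((( &%^$';

// Group 1 captures one character that is not a letter, digit, or whitespace;
// \1+ then requires that same character to repeat at least once more.
// '&%^$' is untouched because no character in it repeats.
$stripped = preg_replace('/([^\pL\pN\s])\1+/u', '', $name);

// Assumed extra step: squeeze the leftover whitespace runs to one space.
$clean = preg_replace('/\s{2,}/u', ' ', $stripped);

echo $clean, PHP_EOL; // Malines - Blockbuster (prod. Malines) &%^$
```

If the spacing does not matter, the second preg_replace call can simply be dropped.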
Upvotes: 2