preg replace remove the same repeated non-word character

Question

I want to remove the same repeated non-word character.

My string looks like this:

Malines - Blockbuster (prod. Malines) ***** (((((( &%^$ should be Malines - Blockbuster (prod. Malines) &%^$

Only repeated non-word characters should be removed. Could you help me?

Casimir et Hippolyte · Accepted Answer

Using the same idea as Arlaud Agbe Pierre (which is the way to do it), the pattern can be shortened using the Xan character class. \p{Xan} is the union of \p{L} and \p{N}. \P{Xan} (with an uppercase "P") is the negation of this class (i.e. everything that is not a letter or a number).

$str = 'Malines - Blockbuster (prod. Malines) ààà ***** (((((( &%^$';

echo preg_replace('~(\P{Xan})\1+~u', '$1', $str);

Note: with this pattern, runs of the same whitespace character are collapsed to a single one too.
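To make the behaviour concrete, here is a minimal sketch that wraps the pattern in a small function and prints the results. The function name collapse_repeats is only an illustrative choice, not something from the answer, and the fallback on a PCRE error is an added assumption.

```php
<?php
// Hypothetical helper name; the pattern itself is the one shown above.
function collapse_repeats(string $s): string
{
    // (\P{Xan}) captures one character that is neither a letter nor a digit;
    // \1+ then requires one or more further copies of that same character.
    // If PCRE reports an error (e.g. invalid UTF-8), return the input unchanged.
    return preg_replace('~(\P{Xan})\1+~u', '$1', $s) ?? $s;
}

$str = 'Malines - Blockbuster (prod. Malines) ààà ***** (((((( &%^$';
echo collapse_repeats($str), PHP_EOL;
// Malines - Blockbuster (prod. Malines) ààà * ( &%^$

// Three identical spaces collapse to one.
echo collapse_repeats("a   b"), PHP_EOL;
// a b
```

Each run is reduced to a single copy of its character, and the repeated letters "ààà" are left alone because they belong to \p{Xan}. Only identical neighbours are collapsed: a space followed by a tab and another space is left unchanged.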

Another way:

echo preg_replace('~(\P{Xan})\K\1+~u', '', $str);

where \K resets the start of the reported match, so everything matched before it is left out of the match result. (Note that capturing groups defined before \K can still be used after it.)
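As a rough illustration of what \K changes, the sketch below compares the two variants and uses preg_match to show which part of a run is reported as the match. The sample string 'ab!!!cd' and the variable names are made up for this example.

```php
<?php
$str = 'Malines - Blockbuster (prod. Malines) ààà ***** (((((( &%^$';

// Variant 1: match the whole run and put one copy back with $1.
$withGroup = preg_replace('~(\P{Xan})\1+~u', '$1', $str);

// Variant 2: \K keeps the first character out of the reported match,
// so only the extra copies are replaced (here with an empty string).
$withK = preg_replace('~(\P{Xan})\K\1+~u', '', $str);

var_dump($withGroup === $withK); // bool(true)

// What the \K pattern actually reports for a run of three "!":
preg_match('~(\P{Xan})\K\1+~u', 'ab!!!cd', $m);
var_dump($m[0]); // string(2) "!!"  (the part after \K)
var_dump($m[1]); // string(1) "!"   (group 1 is still captured)
```

Both variants give the same result here; the \K form simply avoids rewriting the character that is kept.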
