Vishnu

Reputation: 2452

Remove all symbols except the dot, and remove everything inside brackets

I am creating SEO-friendly URL permalinks. I have strings like the ones below, which contain extra spaces, symbols, etc.

INPUT examples:

software version 1.2.33 !##$%@~_+:";,|}{[];,;#&*^{2014}
Вася Обломов - Многоходовочка! (2014) MP3 [bitsnoop]
Дельфин - Андрей $$ (2014) MP3 [bitsnoop]
Laidback Luke & Uberjak'd – Go (Original Mix) [Hysteria] [bitsnoop]
Bob Dylan - Down In The Groove [320k MP3] [bitsnoop]

Desired OUTPUT:

software version 1.2.33
Вася Обломов Многоходовочка MP3
Дельфин Андрей MP3
Laidback Luke Uberjakd Go
Bob Dylan Down In The Groove

What I tried:

$string = "ABC (Test1) hello$";
$string = preg_replace("/\([^)]+\)/","",$string); // 'ABC  hello$' (two spaces remain)
$string = preg_replace("/[^ \w]+/", "", $string);

In short, I need to remove everything inside brackets such as {}, [] and (), and remove all symbols except the . (dot).

P.S.: the input contains UTF-8 encoded strings as well.

Upvotes: 0

Views: 134

Answers (1)

Avinash Raj

Reputation: 174776

Use the regex below and replace every match with an empty string.

 *(?:\{[^}]*\}|\[[^\]]*\]|\([^)]*\)|[^\p{L}\p{N}\s.])

Code:

$string = <<<EOT
software version 1.2.33 !##$%@~_+:";,|}{[];,;#&*^{2014}
Вася Обломов - Многоходовочка! (2014) MP3 [bitsnoop]
Дельфин - Андрей $$ (2014) MP3 [bitsnoop]
Laidback Luke & Uberjak'd – Go (Original Mix) [Hysteria] [bitsnoop]
Bob Dylan - Down In The Groove [320k MP3] [bitsnoop]
EOT;
echo preg_replace('~ *(?:\{[^}]*\}|\[[^\]]*\]|\([^)]*\)|[^\p{L}\p{N}\s.])~u', '', $string);

Output:

software version 1.2.33
Вася Обломов Многоходовочка MP3
Дельфин Андрей MP3
Laidback Luke Uberjakd Go
Bob Dylan Down In The Groove

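\p{L} matches any kind of letter from any language, and \p{N} matches any kind of numeric character. The u modifier makes PCRE treat both the pattern and the subject as UTF-8, which the Cyrillic input needs.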
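
To make the individual alternatives easier to read, the same pattern can be written with the x (extended) modifier and wrapped in a function. This is only a sketch: the name clean_title() is hypothetical, the trim() call and the null fallback are additions that are not part of the regex above, and the type declarations assume PHP 7 or later.

```php
<?php
// Hypothetical helper; the name clean_title() is an assumption for illustration.
function clean_title(string $title): string
{
    // Same pattern as above, spelled out with the x flag so each alternative
    // can carry a comment. The u flag treats pattern and subject as UTF-8.
    $pattern = '~
        [ ]*                      # any spaces directly before the match
        (?:
            \{[^}]*\}             # a {...} group
          | \[[^\]]*\]            # a [...] group
          | \([^)]*\)             # a (...) group
          | [^\p{L}\p{N}\s.]      # one char that is not a letter, number, whitespace or dot
        )
    ~xu';

    // preg_replace() returns null on failure (for example on invalid UTF-8),
    // so fall back to the unchanged input in that case.
    $cleaned = preg_replace($pattern, '', $title) ?? $title;

    // Assumption: leading and trailing whitespace is not wanted in a permalink.
    return trim($cleaned);
}

echo clean_title('Bob Dylan - Down In The Groove [320k MP3] [bitsnoop]'), "\n";
echo clean_title("Laidback Luke & Uberjak'd – Go (Original Mix) [Hysteria] [bitsnoop]"), "\n";
```

In extended mode a literal space has to be written as [ ] (or escaped), because unescaped whitespace in the pattern is ignored.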

Upvotes: 3
