motioz

Reputation: 662

PHP: How to remove non-language characters from a string?

How can I remove all non-language characters?

I want to remove characters like the ones below, and any other characters that are not part of a language:



I am using this:

preg_replace("/[^a-z0-9A-Z\-\'\|\!\.\?\:\)\(\;\*\"]/u", " ", $text );

This works for English, but I need to allow characters from all languages, such as Russian, Arabic, Hebrew, Japanese...

Are there any string functions I can use to keep all language characters?

Thanks

Upvotes: 1

Views: 4362

Answers (2)

Terry Lin

Reputation: 2599

Tim Pietzcker's answer did not work in my case.

This works:

$after = preg_replace('/[^\w\s]+/u','' , $before);
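To make the behaviour of the pattern above easier to check, here is a minimal sketch. It assumes a PHP build where the `/u` modifier also makes `\w` match letters and digits from any script (not only ASCII). The function name `strip_non_word` is hypothetical and only used for illustration. Note that this pattern deletes punctuation as well, and keeps the underscore.

```php
<?php
// Hypothetical helper wrapping the pattern from this answer.
function strip_non_word(string $before): string
{
    // [^\w\s]+ matches runs of characters that are neither word
    // characters (letters, digits, underscore) nor whitespace.
    $after = preg_replace('/[^\w\s]+/u', '', $before);

    // preg_replace() returns null on error (for example, invalid UTF-8
    // input with /u); this sketch simply falls back to the input.
    return $after ?? $before;
}

echo strip_non_word('Привет, мир!'), "\n";
echo strip_non_word('snake_case (ok?)'), "\n";
```

Because the replacement is an empty string, removed characters leave no gap, so `a,b` becomes `ab`. Use `' '` as the replacement if words should stay separated.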

Upvotes: 1

Tim Pietzcker

Reputation: 336098

No regex will be perfect for what you want - language and writing are just too complex for this. But an approximation could be:

preg_replace('/[^\p{L}\p{M}\p{Z}\p{N}\p{P}]/u', ' ', $text);

This replaces with a space every character that does not have one of the Unicode properties “letter”, “mark”, “separator”, “number” or “punctuation”.
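As a rough illustration of the approximation above, the following sketch wraps the same pattern in a function and runs it on mixed-script input. The function name `keep_language_chars` and the sample string are assumptions made for the example. Falling back to the original string when `preg_replace()` fails is just one possible choice.

```php
<?php
// Hypothetical helper; the pattern is the one given in this answer.
function keep_language_chars(string $text): string
{
    // Keep letters (L), marks (M), separators (Z), numbers (N) and
    // punctuation (P); replace every other character with a space.
    $clean = preg_replace('/[^\p{L}\p{M}\p{Z}\p{N}\p{P}]/u', ' ', $text);

    // With /u, preg_replace() returns null if $text is not valid UTF-8.
    // This sketch returns the input unchanged in that case.
    return $clean ?? $text;
}

// Single quotes so that "$" is not treated as a PHP variable.
$sample = 'Привет, мир! ☺ שלום 日本語 $5';
echo keep_language_chars($sample), "\n";
```

Symbols such as `☺` and `$` belong to the Unicode "symbol" category, so they are replaced. Tabs and newlines are control characters rather than separators, so they are turned into spaces as well.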

Upvotes: 11
