Reputation: 77
I've been trying for a few hours to get this to work the way I need, but nothing works quite like it should. I'm building a discussion-board type thing and have made a way to tag other users by putting @username in the post text.
Currently I have this code to strip anything that wouldn't be part of the username once the tags have already been pulled out of the entire text:
$name= preg_replace("/[^A-Za-z0-9_]/",'',$name);
This works well because it correctly captures names that appear as, for example, "(@username)", "@username:", "@username, some text", etc. (so it removes the ",", ":", and ")").
However, this does not work when the user has non-ASCII characters in their username. For example, if it's @üsername, the line above gives sername, which is not useful.
Is there a way, using preg_replace, to still strip this additional punctuation but retain any non-ASCII letters?
Any help is much appreciated :)
Upvotes: 1
Views: 853
Reputation: 784958
To detect punctuation characters, you can use the Unicode property \p{P} instead, together with the u modifier so that the pattern and the subject are treated as UTF-8:
$name = preg_replace('/[\p{P} ]+/u', '', $name);
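A minimal sketch of how this deny-list pattern behaves, assuming UTF-8 input and the u modifier. The helper name strip_punctuation() and the sample strings are made up for illustration. One thing to watch: the underscore belongs to the Unicode punctuation category (Pc, connector punctuation), so \p{P} removes it as well, which may not be what you want if underscores are allowed in usernames.

```php
<?php
// strip_punctuation() is a hypothetical helper name used for illustration.
function strip_punctuation(string $name): string
{
    // u: treat pattern and subject as UTF-8 so \p{P} sees whole code points.
    // preg_replace() returns null on failure (e.g. invalid UTF-8), hence the fallback.
    return preg_replace('/[\p{P} ]+/u', '', $name) ?? '';
}

echo strip_punctuation('(@üsername)'), PHP_EOL; // üsername
echo strip_punctuation('@user_name:'), PHP_EOL; // username (the underscore is stripped too)
```

Symbols such as $ or + are not in \p{P} (they are in the \p{S} categories), so this approach leaves them in place.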
Upvotes: 1
Reputation: 121000
You are entering the area of Unicode regexps. Using PCRE's short property names (\p{L} for letters, \p{N} for numbers):
$name = preg_replace('/[^\p{L}\p{N}_]/u', '', $name);
or the other way round. The link I provided contains more examples.
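As a rough sketch of the allow-list approach, assuming UTF-8 input: keep Unicode letters (\p{L}), Unicode numbers (\p{N}) and the underscore, and drop everything else. The helper name clean_username() and the sample strings are hypothetical, not from the original post.

```php
<?php
// clean_username() is a hypothetical helper name used for illustration.
function clean_username(string $name): string
{
    // Remove everything except Unicode letters, Unicode numbers and "_".
    // preg_replace() returns null on failure (e.g. invalid UTF-8), hence the fallback.
    return preg_replace('/[^\p{L}\p{N}_]/u', '', $name) ?? '';
}

foreach (['(@üsername)', '@user_name:', '@username,'] as $raw) {
    echo clean_username($raw), PHP_EOL;
}
// Prints:
// üsername
// user_name
// username
```

If usernames can arrive in decomposed form (a base letter followed by a combining accent), the combining mark is not matched by \p{L}; adding \p{M} to the character class would keep it, assuming that is the behaviour you want.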
Upvotes: 4