Reputation: 103
I am trying to filter a variable so that it keeps only alphanumeric characters, spaces, accented characters, and single quotes, and replaces everything else with a space. So a string like:
substitué à une otage % ? vendredi 23 mars lors de l’attaque
should output :
substitué à une otage vendredi 23 mars lors de l’attaque
but the result I get is:
substitué à une otage vendredi 23 mars lors de l
Could you please help? This is my code:
$whitelist = "/[^a-zA-Z0-9а-àâáçéèèêëìîíïôòóùûüÂÊÎÔúÛÄËÏÖÜÀÆæÇÉÈŒœÙñý',. ]/";
$descreption = preg_replace($whitelist, ' ', $ds);
}else{
$errors = self::DESCREPTION_ERROR;
return false;
}
Upvotes: 0
Views: 1797
Reputation: 16373
You may have a look at Unicode character properties.
Summary of my changes:
\p{L} to match all letters
\- to allow hyphens
both straight (') and typographic (’) apostrophes
Here is the result:
$whitelist = '/[^\p{L}0-9\-\'’,. ]/u';
There is probably room for further improvement. Finally, don't forget to add the u modifier!
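To make the pattern above concrete, here is a minimal sketch of how it could be applied to the string from the question. The function name filterDescription is hypothetical and not part of the original code, and the sketch assumes the input is valid UTF-8.

```php
<?php
// Hypothetical helper for illustration; the name is not from the original post.
function filterDescription(string $input): string
{
    // Keep letters of any script (\p{L}), digits, hyphen, straight and
    // typographic apostrophes, comma, dot and space. The u modifier makes
    // PCRE treat both the pattern and the subject as UTF-8.
    $whitelist = '/[^\p{L}0-9\-\'’,. ]/u';

    // Every character outside the whitelist is replaced by a single space.
    $result = preg_replace($whitelist, ' ', $input);

    // preg_replace() returns null on failure (for example, invalid UTF-8 input).
    return $result ?? '';
}

$out = filterDescription('substitué à une otage % ? vendredi 23 mars lors de l’attaque');
echo $out, PHP_EOL;
```

Each removed character becomes its own space, so runs of spaces may remain in the result; collapsing them would be a separate step.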
Upvotes: 1
Reputation: 147146
One way to deal with the range of accented characters is to use the POSIX [:alnum:] class, which in PHP, in conjunction with the u modifier, will match all of them. It can then be put into a negated character class together with the other characters you want to keep, so that everything else gets replaced:
$string = 'substitué à une otage % ? vendredi 23 mars lors de l’attaque';
echo preg_replace("/[^[:alnum:]'’,.]/u", ' ', $string);
Output:
substitué à une otage vendredi 23 mars lors de l’attaque
As has been pointed out in the comments, ’ is not the same as ' and so it also needs to be added to the set of characters you want to keep.
Upvotes: 1
Reputation: 6732
Your regex is faulty. The part а-à gives the error "Character range is out of order". I guess the - was added by mistake there...
Then a small hint: ’ is not '.
[^a-zA-Z0-9àâáçéèèêëìîíïôòóùûüÂÊÎÔúÛÄËÏÖÜÀÆæÇÉÈŒœÙñý'’,. ]
should work fine.
Also, if you're working with regex, tools like RegExr or regex101 are really helpful.
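As a rough sketch of how the corrected character class might be used in PHP, the snippet below wraps it in delimiters and adds the u modifier. The u modifier and the variable names are assumptions on my part (the class above is shown without delimiters or flags), and the script is assumed to be saved as UTF-8.

```php
<?php
// Assumption: this file is saved as UTF-8, so the accented letters and the
// typographic apostrophe in the pattern are valid UTF-8 literals.
// The u modifier is added here so multi-byte characters are matched as whole
// characters rather than as individual bytes.
$explicitWhitelist = '/[^a-zA-Z0-9àâáçéèèêëìîíïôòóùûüÂÊÎÔúÛÄËÏÖÜÀÆæÇÉÈŒœÙñý\'’,. ]/u';

$input = 'substitué à une otage % ? vendredi 23 mars lors de l’attaque';
$cleaned = preg_replace($explicitWhitelist, ' ', $input);

if ($cleaned === null) {
    // preg_replace() returns null on error; preg_last_error() gives the code.
    echo 'Regex error code: ', preg_last_error(), PHP_EOL;
} else {
    echo $cleaned, PHP_EOL;
}
```

An explicit list like this only keeps the accented letters that are written out, so any letter missing from the list is replaced by a space.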
Upvotes: 3