Reputation: 123
I want to remove a set of characters from a string. I am using preg_replace to replace Unicode characters with an empty string.
I have several ranges of Unicode characters to remove.
The following code works.
$output = "Clean :this; [cnv\al?id@ non AS]CII äóchar^acters.";
$output = preg_replace('/[\x00-\x1F]|[\x21-\x2C]|[\x3A-\x40]|[\x5B-\x5E]|[\x7B-\x7D]|[\x80-\xBF]|[\x2B0-\x36F]/','', $output);
echo $output;
But the following code gives an error.
$output = "Clean :this; [cnv\al?id@ non AS]CII äóchar^acters.";
$output = preg_replace('/[\x00-\x1F]|[\x21-\x2C]|[\x3A-\x40]|[\x5B-\x5E]|[\x7B-\x7D]|[\x80-\xBF]|[\x2B0-\x36F]|[\x2000-\x2BFF]|[\x2E00-\x2E7F]|[\x3000-\x303F]|[\x1D000-\x1D24F]|[\x1F600-\x1F77F]|[\x1F000-\x1F0FF]/','', $output);
echo $output;
Error:- preg_replace(): Compilation failed: range out of order in character class at offset 97
I could use a for loop to remove the Unicode characters from the string, but then I would need to run the loop over more and more ranges.
Which approach is better here, a for loop or preg_replace? If preg_replace is better, how do I fix the error above?
Upvotes: 0
Views: 2475
Reputation: 37365
Your problem is that \x accepts only two hex digits, so \x2000 is read as \x20 followed by the literal characters 00. For larger code points you need to add curly brackets, like:
$output = "Clean :this; [cnv\al?id@ non AS]CII äóchar^acters.";
$output = preg_replace('/[\x00-\x1F]|[\x21-\x2C]|[\x3A-\x40]|[\x5B-\x5E]|[\x7B-\x7D]|[\x80-\xBF]|[\x{2B0}-\x{36F}]|[\x{2000}-\x{2BFF}]|[\x{2E00}-\x{2E7F}]|[\x{3000}-\x{303F}]|[\x{1D000}-\x{1D24F}]|[\x{1F600}-\x{1F77F}]|[\x{1F000}-\x{1F0FF}]/u','', $output);
With that, you'll also need to add the u modifier to your regex.
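If the list of ranges keeps growing, one option is to keep the ranges as data and build a single character class from them, so there is still only one preg_replace call and no per-character loop. This is a minimal sketch, not the only way to do it: the function name stripCodePointRanges and the sample input string are made up for illustration, and the syntax assumes PHP 7.1 or later.

```php
<?php
// Hypothetical helper: removes every character whose code point falls in
// one of the given [start, end] ranges (inclusive).
function stripCodePointRanges(string $text, array $ranges): string
{
    $class = '';
    foreach ($ranges as [$start, $end]) {
        // \x{...} accepts any number of hex digits, unlike plain \xHH.
        $class .= sprintf('\x{%X}-\x{%X}', $start, $end);
    }
    // The u modifier makes PCRE treat pattern and subject as UTF-8.
    $result = preg_replace('/[' . $class . ']/u', '', $text);
    // preg_replace returns null on failure (for example, invalid UTF-8 input);
    // in that case this sketch returns the text unchanged.
    return $result ?? $text;
}

// The same ranges as in the pattern above, written as integers.
$ranges = [
    [0x00, 0x1F], [0x21, 0x2C], [0x3A, 0x40], [0x5B, 0x5E], [0x7B, 0x7D],
    [0x80, 0xBF], [0x2B0, 0x36F], [0x2000, 0x2BFF], [0x2E00, 0x2E7F],
    [0x3000, 0x303F], [0x1D000, 0x1D24F], [0x1F600, 0x1F77F], [0x1F000, 0x1F0FF],
];

// Sample input (an assumption, not from the question): punctuation, an emoji, a dash.
echo stripCodePointRanges("Clean :this; [inv@lid] text \u{1F600}\u{2014}ok", $ranges), "\n";
// prints "Clean this invlid text ok"
```

Merging everything into one [...] class is equivalent to the long alternation of separate classes, and adding a range becomes a one-line change to the array.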
Upvotes: 2