Real Dreams

Reputation: 18030

PHP crashes on preg_replace

I ran the following script using php.exe:

preg_replace('#(?:^[^\pL]*)|(?:[^\pL]*$)#u','',$string);

or its equivalent:

preg_replace('#(?:^[^\pL]*|[^\pL]*$)#u','',$string);

With $string="S" or $string=" ذذ " it works; with $string='ذ' it yields incorrect output; and with $string='ذذ' PHP crashes.

However, it works in PHP versions 4.4.0 - 4.4.9 and 5.0.5 - 5.1.6.

What is wrong?

See: http://3v4l.org/T3rpV


<?php
$string='دد';
echo preg_replace('#(?:^[^\pL]*)|(?:[^\pL]*$)#u','',$string);

Output for 5.4.0 - 5.5.0alpha6

Process exited with code 139.

Output for 5.2.0 - 5.3.22, 5.5.0beta1

 

Output for 4.4.0 - 4.4.9, 5.0.5 - 5.1.6

دد 

Output for 4.3.11, 5.0.0 - 5.0.4

Warning: preg_replace(): Compilation failed: PCRE does not support \L, \l, \N, \P, \p, \U, \u, or \X at offset 7 in /in/T3rpV on line 3 

Output for 4.3.0 - 4.3.10

Warning: Compilation failed: PCRE does not support \L, \l, \N, \P, \p, \U, \u, or \X at offset 7 in /in/T3rpV on line 3

Upvotes: 20

Views: 2486

Answers (5)

Real Dreams

Reputation: 18030

The bug has since been fixed:

Output for 4.4.0 - 4.4.9, 5.0.5 - 5.1.6, 5.5.27 - 5.5.33, 5.6.11 - 7.0.4, hhvm-3.6.1 - 3.12.0
    دد

Upvotes: 0

Ja͢ck

Reputation: 173652

From looking at the expression itself, there are two things that could be improved:

  1. The * quantifiers aren't very useful; why would you want to replace a potentially empty match with an empty string? In fact, running this on my system makes preg_replace() return NULL.

  2. The two non-capturing groups can be merged into one.

This is the code after applying both improvements:

$string = 'ﺫﺫ';
var_dump(preg_replace('#(?:^[^\pL]+|[^\pL]+$)#u', '', $string));
// string(4) "ﺫﺫ"

3v4l results
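
Since preg_replace() returns NULL on failure instead of raising an error, it may be worth checking the return value explicitly. The following is only a minimal sketch: the helper name strip_non_letter_edges() is hypothetical (it is not a PHP built-in), and it assumes a PCRE build with Unicode property support.

```php
<?php
// Hypothetical helper (not part of PHP): removes non-letter characters
// from both ends of a UTF-8 string, using the merged pattern with "+".
function strip_non_letter_edges($string)
{
    $result = preg_replace('#^[^\pL]+|[^\pL]+$#u', '', $string);

    if ($result === null) {
        // preg_replace() returns NULL on failure, e.g. for invalid UTF-8 input.
        throw new RuntimeException(
            'preg_replace() failed, error code ' . preg_last_error()
        );
    }

    return $result;
}

var_dump(strip_non_letter_edges(' دد '));
```

Throwing an exception is just one way to surface the failure; logging preg_last_error() and returning the input unchanged would be an equally reasonable choice.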

If you're just looking for a multibyte trim function (supported from 4.3.0 onwards):

$string=' دد';
var_dump(preg_replace('#(?:^\s+|\s+$)#u', '', $string));

3v4l results

Upvotes: 1

Ardy Dedase

Reputation: 1088

Use preg_quote to escape special characters properly before using the string with your regex. For example:

<?php
$string = preg_quote("\دد");
echo preg_replace('#(?:^[^\pL]*)|(?:[^\pL]*$)#u','',$string);

See it in action: http://3v4l.org/LeBXg

More about preg_quote.

Cheers,

Ardy

Upvotes: 0

Alex QLerR

Reputation: 56

Maybe this will help:

These properties are usually only available if PCRE is compiled with "--enable-unicode-properties".

http://docs.php.net/manual/en/regexp.reference.unicode.php#96479
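
One way to check whether a given PHP installation has this support is to try compiling a pattern that uses a Unicode property. This is a rough sketch rather than an official detection method; the function name pcre_supports_unicode_properties() is made up for illustration.

```php
<?php
// Hypothetical helper (not part of PHP): reports whether this PCRE build
// accepts Unicode properties such as \pL in UTF-8 mode.
function pcre_supports_unicode_properties()
{
    // "@" hides the compile warning; preg_match() returns false
    // when the pattern cannot be compiled, and 1 when "a" matches \pL.
    return @preg_match('/\pL/u', 'a') === 1;
}

echo 'PCRE version: ', PCRE_VERSION, "\n";
echo 'Unicode properties: ',
    pcre_supports_unicode_properties() ? 'yes' : 'no', "\n";
```

If this prints "no", the "PCRE does not support \L, \l, \N, \P, \p, \U, \u, or \X" warning shown in the question is the expected outcome for \pL patterns.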

Upvotes: 3

user1646111

Reputation:

You can use the mb_ereg_replace() function as an alternative:

mb_internal_encoding("UTF-8");
mb_regex_encoding("UTF-8");
// mb_ereg_replace() takes the bare pattern: no "#" delimiters and no "u" modifier.
echo mb_ereg_replace('(?:^[^\p{L}]*)|(?:[^\p{L}]*$)', '', $string);

Upvotes: 5
