Reputation: 7102
Here is the code I am trying to run:
$str = 'a,b,c,d';
return preg_split('/(?<![^\\\\][\\\\]),/', $str);
As you can see, the regexp being used here is:
/(?<![^\\][\\]),/
Which is a simple fixed-length negative lookbehind for "preceded by something that isn't a backslash, then something that is!".
This regex works just fine on http://www.phpliveregex.com
But when I go and actually attempt to run the above code, I am spat back the error:
Warning: preg_split() [function.preg-split]: Compilation failed: lookbehind assertion is not fixed length at offset 13
To make matters worse, a fellow programmer tested the code on his 5.4.24 PHP server, and it worked fine.
This leads me to believe that my issues are related to the configuration of my server, which I have very little control over. I am told that my PHP version if 5.2.*
Are there any workarounds/alternatives to preg_replace() that might not have this issue?
Upvotes: 6
Views: 3757
Reputation: 106443
The problem is caused by the bug fixed in PCRE 6.7. Quoting the changelog:
A negated single-character class was not being recognized as fixed-length in lookbehind assertions such as
(?<=[^f])
, leading to an incorrect compile error"lookbehind assertion is not fixed length"
PCRE 6.7 was introduced in PHP 5.2.0, in Nov 2006. As you still have this bug, it means it's not still there at your server - so for a preg-split based workaround you have to use a pattern without a negative character class. For example:
$patt = '/(?<!(?<!\\\\)\\\\),/';
// or...
$patt = '/(?<![\x00-\x5b\x5d-\xFF]\x5c),/';
However, I find the whole approach a bit weird: what if ,
symbol is preceded by exactly three backslashes? Or five? Or any odd number of them? The comma in this case should be considered 'escaped', but obviously you cannot create a lookbehind expression of variable length to cover these cases.
On the second thought, one can use preg_match_all
instead, with a common alternation trick to cover the escaped symbols:
$str = 'e ,a\\,b\\\\,c\\\\\\,d\\\\';
preg_match_all('/(?:[^\\\\,]|\\\\(?:.|$))+/', $str, $matches);
var_dump($matches[0]);
Demo.
I really think I covered all the issues here, those trailing slashes were a killer )
Upvotes: 3
Reputation: 89584
Way to avoid the negated character class (I write \x5c
instead of a lot of backslashes to be more clear)
$result = preg_split('/(?<!(?!\x5c).\x5c),/s', $str);
About the approach itself:
If you are trying to split on comma that are not escaped, you are in the wrong way with a lookbehind since you can't check and undefined number of backslash before the comma. You have several possibilities to solve this problem:
$result = preg_split('/(?:[^\x5c]|\A)(?:\x5c.)*\K,/s', $str);
or
$result = preg_split('/(?<!\x5c)(?:\x5c.)*\K,/s', $str);
or for PHP > 5.2.4
$result = preg_split('/\x5c{2}(*SKIP)(?!)|(?<!\x5c),/s', $str);
Upvotes: 1
Reputation: 31045
I think you are using an older php version since I your error rises on PHP 5.1.6 or lower.
You can check a non working demo here
On the other hand it works for PHP 5.2.16 or higher:
Upvotes: 0