Reputation: 198
It appears that PHP's preg_match has a limit of 3276 on the repeat count of a quantified group in some cases. For example, ^(.|\s){0,3276}$ works, but ^(.|\s){0,3277}$ does not.
It doesn't always apply, though: /^(.){0,3277}$/ works.
I can't find this mentioned anywhere in PHP's documentation or the bug tracker. The number 3276 seems an odd boundary. The only thing I can think of is that it's approximately one tenth of 32767, which is the maximum value of a signed 16-bit integer.
preg_last_error() returns 0.
I've reproduced the issue on http://www.phpliveregex.com/ as well as my local system and the webserver.
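For anyone who wants to reproduce this locally, here is a minimal sketch. The helper name try_pattern() is made up for illustration. It relies on preg_match() returning false and raising a warning when a pattern fails to compile, and it captures that warning instead of printing it. The exact threshold may differ between PHP and PCRE versions.

```php
<?php
// Hypothetical helper: returns [bool $compiled, ?string $warning] for a pattern.
function try_pattern(string $pattern): array
{
    $warning = null;
    // Temporarily capture any warning raised while preg_match() compiles the pattern.
    set_error_handler(function (int $errno, string $errstr) use (&$warning): bool {
        $warning = $errstr;
        return true; // handled here, so PHP does not print it
    });
    $result = preg_match($pattern, '');
    restore_error_handler();

    // preg_match() returns false on failure, 0 or 1 otherwise.
    return [$result !== false, $warning];
}

// Try the two repeat counts from the question.
foreach ([3276, 3277] as $n) {
    [$ok, $warning] = try_pattern('/^(.|\s){0,' . $n . '}$/');
    printf("%d: %s%s\n", $n, $ok ? 'compiled' : 'failed', $warning ? " ($warning)" : '');
}
```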
EDIT: Looks like we're getting "Warning: preg_match(): Compilation failed: regular expression is too large at offset 16" out of the code, so it appears to be the same issue as the question "PHP preg_match_all limit".
However, the regex itself isn't very large... Does PHP do some kind of expansion when you have repeating groups that's making it too large?
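As far as I understand the PCRE documentation, a group with a bounded quantifier is compiled by replicating the group's compiled code, and the compiled pattern has a size cap (roughly 64K with the default build settings). If that is what is happening here, the largest usable repeat count should shrink as the group gets bigger. The sketch below probes this with a binary search. The helper names are made up, and it assumes that once a count fails to compile, all larger counts fail too.

```php
<?php
// Hypothetical helper: true if the pattern compiles (warnings silenced with @).
function pattern_compiles(string $pattern): bool
{
    return @preg_match($pattern, '') !== false;
}

// Hypothetical helper: largest n in [1, $max] for which ^<group>{0,n}$ compiles,
// or 0 if none does. Assumes compilability is monotonic in n.
function largest_repeat(string $group, int $max = 65535): int
{
    $lo = 0;
    $hi = $max;
    while ($lo < $hi) {
        $mid = intdiv($lo + $hi + 1, 2); // round up so the loop always makes progress
        if (pattern_compiles('/^' . $group . '{0,' . $mid . '}$/')) {
            $lo = $mid;
        } else {
            $hi = $mid - 1;
        }
    }
    return $lo;
}

// Compare groups of different sizes; results depend on the PHP/PCRE build.
foreach (['(.)', '(.|\s)', '(?:.|\s)', '(foo|bar|baz)'] as $group) {
    printf("%-15s max repeat: %d\n", $group, largest_repeat($group));
}
```

If the numbers fall as the group grows, that points to the compiled size of the pattern rather than the repeat count itself being the limiting factor.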
Upvotes: 7
Views: 4395
Reputation: 146660
To handle Perl-compatible regular expressions, PHP bundles a third-party library (PCRE) that takes care of the job. The behaviour you describe is actually documented, at least for Perl itself:
The "*" quantifier is equivalent to {0,} , the "+" quantifier to {1,} , and the "?" quantifier to {0,1} . n and m are limited to non-negative integral values less than a preset limit defined when perl is built. This is usually 32766 on the most common platforms.
So there's always a hard limit. Why do your tests suggest that PHP's limit is 10 times smaller than the typical one? No idea about that :)
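If the goal is just "at most N characters of anything, newlines included", one possible way around the limit is to drop the group entirely: with the s (DOTALL) flag, a plain . also matches newlines, and a repeated single character is not subject to the same group limit. A minimal sketch, with a made-up helper name, assuming the default newline setting and byte-based matching (no u flag):

```php
<?php
// Hypothetical helper: true if $text matches ^.{0,$max}$ with "." matching newlines too.
function matches_bounded(string $text, int $max): bool
{
    // The s flag lets "." match any character, so the "(.|\s)" group is not needed.
    return preg_match('/^.{0,' . $max . '}$/s', $text) === 1;
}

var_dump(matches_bounded(str_repeat("a\n", 1000), 3277));
var_dump(matches_bounded(str_repeat('a', 3278), 3277));
```

As in the original pattern, $ also matches just before a final newline, so a trailing newline is not counted against the limit.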
Upvotes: 1