Reputation: 1191
I have the following regex:
/\{\s?joomla-tag\s+(.*<+.+>+.*)\s?\}/is
and the following code:
$regex = "/\{\s?joomla-tag\s+(.*<+.+>+.*)\s?\}/is";
$replace = '<div class="someclass">$1</div>';
$text = preg_replace( $regex, $replace, $text );
Unfortunately, it fails to match the following markup, although it should:
.... many html lines .......
<p>123{joomla-tag Lore<strong>m</strong> ip</p>
<p>sum dolor sit amet}</p>
.... many html lines .......
See the real sample: http://pastebin.com/WSQyrmxd
What's wrong: the regular expression or something else? Could you please suggest a correct variant? In RegExr everything works smoothly, but not in PHP.
On a local server, I simply get NULL back from preg_replace.
EDIT: Finally I found a solution: (thanks, sg3s, for an idea) http://www.pelagodesign.com/blog/2008/01/25/wtf-preg_replace-returns-null/
Upvotes: 2
Views: 1331
Reputation: 75242
You say you solved the problem, but if your solution was to increase the backtrack_limit setting, that's not a solution. In fact, you're probably setting yourself up for bigger problems later on. You need to find out why it's doing so much backtracking.
After \{\s?joomla-tag\s+ locates the beginning of the tag, the first .* initially gobbles up the remainder of the document. Then it starts backing off, trying to let the rest of the regex match. When it reaches a point where <+ can match, the .+ again consumes the rest of the document, and another wave of backtracking begins. And with yet another .* after that, you're making it do a ridiculous amount of unnecessary work.
This is the reason for the rule of thumb:
Don't use the dot metacharacter (especially .* or .+) if you can use something more specific. If you do use the dot, don't use it in single-line or DOTALL mode (i.e., the /s modifier or its inline form, (?s)).
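To see the backtracking problem in isolation, here is a minimal sketch. It assumes a made-up input (an opening tag followed by a long run of text with no `<` in it) and deliberately lowers pcre.backtrack_limit so the effect shows up on a small string. Turning off pcre.jit is also an assumption, made so that the interpreter's limit is the one being exercised. The script prints whether preg_replace() returned NULL and whether the recorded error is the backtrack-limit one.

```php
<?php
// Assumption: disable the JIT so the interpreter's backtrack limit is the one in play.
ini_set('pcre.jit', '0');
// Deliberately low limit, only to make the effect visible on a small input.
ini_set('pcre.backtrack_limit', '1000');

// Hypothetical input: opening tag, a long body without any "<", then a closing brace.
$text = '{joomla-tag ' . str_repeat('a', 100000) . '}';

// The pattern from the question: three greedy dot runs in DOTALL mode.
$greedy = '/\{\s?joomla-tag\s+(.*<+.+>+.*)\s?\}/is';

$result = preg_replace($greedy, '<div class="someclass">$1</div>', $text);
$error  = preg_last_error();

// Report what happened instead of assuming it.
var_dump($result === null);
var_dump($error === PREG_BACKTRACK_LIMIT_ERROR);
```

Raising the limit would only postpone the failure; the amount of work still grows with the size of the document.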
In this case, you know the match should end at the next closing brace (}), so don't let it match any braces before that:
\{\s?joomla-tag\s+([^}]*)\}
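As a quick check, the pattern above can be dropped into the code from the question. This is a sketch using the two sample lines from the question as input; the /i modifier is carried over from the original and /s is no longer needed because the negated class already matches newlines. Note that this version no longer requires an HTML tag inside the braces, which the original pattern did.

```php
<?php
// Sample input taken from the question (two lines, tag spans both).
$text = "<p>123{joomla-tag Lore<strong>m</strong> ip</p>\n<p>sum dolor sit amet}</p>";

// Negated character class: consume everything up to the next closing brace.
$regex   = '/\{\s?joomla-tag\s+([^}]*)\}/i';
$replace = '<div class="someclass">$1</div>';

$result = preg_replace($regex, $replace, $text);

echo $result, "\n";
```

Because [^}]* can never run past the closing brace, the engine has nothing to back off from, so the cost stays proportional to the length of the tag body rather than the whole document.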
Upvotes: 5
Reputation: 34395
Sounds like this may be a pcre.recursion_limit error caused by the PCRE regex engine running out of stack. I've seen this before, although the symptoms are typically more severe (i.e. the web server crashes completely). Note that this class of problem frequently shows up on a local server and not on a remote one, particularly if the local system is running Apache under Windows (the Win32 build of httpd.exe has only 256KB of stack space).
preg_replace() returns NULL when it encounters an error in the PCRE library. You can use the preg_last_error() function to get the last error and print out a message like so:
$pcre_err = preg_last_error(); // PHP 5.2 and above.
if ($pcre_err === PREG_NO_ERROR) {
    // No PCRE error was recorded for the last call.
    $msg = 'Successful non-match.';
} else {
    // PCRE error: map the error code to the name of its constant.
    switch ($pcre_err) {
        case PREG_INTERNAL_ERROR:
            $msg = 'PREG_INTERNAL_ERROR';
            break;
        case PREG_BACKTRACK_LIMIT_ERROR:
            $msg = 'PREG_BACKTRACK_LIMIT_ERROR';
            break;
        case PREG_RECURSION_LIMIT_ERROR:
            $msg = 'PREG_RECURSION_LIMIT_ERROR';
            break;
        case PREG_BAD_UTF8_ERROR:
            $msg = 'PREG_BAD_UTF8_ERROR';
            break;
        case PREG_BAD_UTF8_OFFSET_ERROR:
            $msg = 'PREG_BAD_UTF8_OFFSET_ERROR';
            break;
        default:
            $msg = 'Unrecognized PREG error';
            break;
    }
}
echo($msg);
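If you would rather not repeat that check after every call, the same idea can be wrapped in a small helper. This is only a sketch: safe_preg_replace is a hypothetical name, not a PHP built-in, and the scalar type hints assume PHP 7 or later. It throws an exception carrying the constant name instead of silently handing back NULL.

```php
<?php
/**
 * Hypothetical helper (not part of PHP): run preg_replace() and throw
 * a RuntimeException instead of returning NULL on a PCRE error.
 */
function safe_preg_replace(string $pattern, string $replacement, string $subject): string
{
    $result = preg_replace($pattern, $replacement, $subject);

    if ($result === null) {
        // Map the PCRE error code to the name of its constant.
        $names = [
            PREG_INTERNAL_ERROR        => 'PREG_INTERNAL_ERROR',
            PREG_BACKTRACK_LIMIT_ERROR => 'PREG_BACKTRACK_LIMIT_ERROR',
            PREG_RECURSION_LIMIT_ERROR => 'PREG_RECURSION_LIMIT_ERROR',
            PREG_BAD_UTF8_ERROR        => 'PREG_BAD_UTF8_ERROR',
            PREG_BAD_UTF8_OFFSET_ERROR => 'PREG_BAD_UTF8_OFFSET_ERROR',
        ];
        $code = preg_last_error();
        throw new RuntimeException($names[$code] ?? 'Unrecognized PREG error');
    }

    return $result;
}

// Usage: an invalid UTF-8 subject with the /u modifier makes preg_replace() fail.
try {
    echo safe_preg_replace('/./u', 'x', "\xff"), "\n";
} catch (RuntimeException $e) {
    echo 'PCRE failure: ', $e->getMessage(), "\n";
}
```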
I've explained this error in detail with answers to related questions. See:
RegExp in preg_match function returning browser error
PHP regex: is there anything wrong with this code?
Minifying final HTML output using regular expressions with CodeIgniter
Good luck!
Upvotes: 4
Reputation: 388103
It works for me.
Note that from an HTML standpoint, your replacement does not create a valid structure.
It still works for me, even with the full HTML example you provided. So there has to be something wrong with your other code; you might want to enable full error output to see if there's some other issue.
Upvotes: 2