simon

Reputation: 1191

PHP regex not working - returns NULL on local server, but works properly on another

I have the following regex:

/\{\s?joomla-tag\s+(.*<+.+>+.*)\s?\}/is

and the following code:

$regex = "/\{\s?joomla-tag\s+(.*<+.+>+.*)\s?\}/is";
$replace = '<div class="someclass">$1</div>';
$text = preg_replace( $regex, $replace, $text );

Unfortunately, it fails to match the following markup, although it should:

.... many html lines .......
<p>123{joomla-tag Lore<strong>m</strong> ip</p>
<p>sum dolor sit amet}</p>
.... many html lines .......

See the real sample: http://pastebin.com/WSQyrmxd

What's wrong: the regular expression or something else? Could you please suggest a correct variant? In RegExr everything works smoothly, but not in PHP.

On a local server, I simply get NULL back from preg_replace().


EDIT: Finally I found a solution: (thanks, sg3s, for an idea) http://www.pelagodesign.com/blog/2008/01/25/wtf-preg_replace-returns-null/

Upvotes: 2

Views: 1331

Answers (3)

Alan Moore

Reputation: 75242

You say you solved the problem, but if your solution was to increase the pcre.backtrack_limit setting, that's not a solution. In fact, you're probably setting yourself up for bigger problems later on. You need to find out why the regex is doing so much backtracking.

After \{\s?joomla-tag\s+ locates the beginning of the tag, the first .* initially gobbles up the remainder of the document. Then it starts backing off, trying to let the rest of the regex match. When it reaches a point where <+ can match, the .+ again consumes the rest of the document, and another wave of backtracking begins. And with yet another .* after that, you're making it do a ridiculous amount of unnecessary work.

This is the reason for the rule of thumb:

Don't use the dot metacharacter (especially .* or .+) if you can use something more specific. If you do use the dot, don't use it in single-line or DOTALL mode (i.e., the /s modifier or its inline, (?s) form).

In this case, you know the match should end at the next closing brace (}), so don't let it match any braces before that:

\{\s?joomla-tag\s+([^}]*)\}
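
The difference can be reproduced in isolation. The following is a minimal sketch, not the asker's actual page: the filler HTML, the deliberately low pcre.backtrack_limit of 10000, the /i flag on the second pattern, and switching off pcre.jit (so that the interpreted matcher and its limit are what gets exercised) are all assumptions made for illustration.

```php
<?php
// Assumption: JIT is switched off so the interpreted matcher and its
// backtrack limit are what we observe; the limit is lowered to fail fast.
ini_set('pcre.jit', '0');
ini_set('pcre.backtrack_limit', '10000');

// Hypothetical stand-in for the real page: the tag followed by many HTML lines.
$filler = str_repeat("<p>filler line</p>\n", 500);
$text   = "<p>123{joomla-tag Lore<strong>m</strong> ip</p>\n"
        . "<p>sum dolor sit amet}</p>\n"
        . $filler;

$replace = '<div class="someclass">$1</div>';

// Original pattern: three greedy dots in DOTALL mode.
$greedy      = '/\{\s?joomla-tag\s+(.*<+.+>+.*)\s?\}/is';
$greedyOut   = preg_replace($greedy, $replace, $text);
$greedyError = preg_last_error();

// Negated character class: the capture can never run past the closing brace.
$bounded    = '/\{\s?joomla-tag\s+([^}]*)\}/i';
$boundedOut = preg_replace($bounded, $replace, $text);

var_dump($greedyOut === null);
var_dump($greedyError === PREG_BACKTRACK_LIMIT_ERROR);
var_dump(is_string($boundedOut));
```

Under these assumptions the greedy pattern should give up with a backtrack-limit error while the bounded pattern completes. How large the document must be before the default limit is hit depends on the PHP and PCRE versions in use.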

Upvotes: 5

ridgerunner

Reputation: 34395

This sounds like it may be a pcre.recursion_limit error, caused by the PCRE regex engine running out of stack. I've seen this before, although the symptoms are typically more severe (i.e. completely crashing the web server). Note that this class of problem frequently shows up on a local server but not on a remote one, particularly if the local system is running Apache under Windows (the Win32 build of httpd.exe has only 256KB of stack space).
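
Since the behaviour differs between the two servers, it may help to compare their PCRE settings first. The snippet below is only a sketch of one way to do that: run it on both machines and compare the output. The $settings array is a name chosen for this illustration.

```php
<?php
// Print the PCRE-related settings of the current PHP installation.
// ini_get() returns false for a directive the build does not register
// (for example pcre.jit when PCRE was compiled without JIT support).
$settings = [];
foreach (['pcre.backtrack_limit', 'pcre.recursion_limit', 'pcre.jit'] as $name) {
    $value = ini_get($name);
    $settings[$name] = ($value === false) ? 'n/a' : $value;
}
$settings['PCRE_VERSION'] = PCRE_VERSION;
$settings['PHP_VERSION']  = PHP_VERSION;

foreach ($settings as $name => $value) {
    echo $name, ' = ', $value, PHP_EOL;
}
```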

preg_replace() returns NULL when it encounters an error in the PCRE library. You can use the preg_last_error() function to get the last error and print out a message like so:

    $pcre_err = preg_last_error();  // PHP 5.2 and above.
    if ($pcre_err === PREG_NO_ERROR) {
        $msg = 'Successful non-match.';
    } else {
        // preg_replace() reported an error; map the code to a readable name.
        switch ($pcre_err) {
            case PREG_INTERNAL_ERROR:
                $msg = 'PREG_INTERNAL_ERROR';
                break;
            case PREG_BACKTRACK_LIMIT_ERROR:
                $msg = 'PREG_BACKTRACK_LIMIT_ERROR';
                break;
            case PREG_RECURSION_LIMIT_ERROR:
                $msg = 'PREG_RECURSION_LIMIT_ERROR';
                break;
            case PREG_BAD_UTF8_ERROR:
                $msg = 'PREG_BAD_UTF8_ERROR';
                break;
            case PREG_BAD_UTF8_OFFSET_ERROR:
                $msg = 'PREG_BAD_UTF8_OFFSET_ERROR';
                break;
            default:
                $msg = 'Unrecognized PREG error';
                break;
        }
    }
    echo($msg);
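
On PHP 8.0 and later the switch can be shortened, because preg_last_error_msg() returns a readable message for the last error. The wrapper below is a hedged sketch of that idea; checked_preg_replace() is a hypothetical helper name invented here, not a PHP function.

```php
<?php
// Hypothetical helper (not part of PHP): wraps preg_replace() and turns the
// silent NULL into an exception. Requires PHP 8.0+ for preg_last_error_msg().
function checked_preg_replace(string $pattern, string $replacement, string $subject): string
{
    $result = preg_replace($pattern, $replacement, $subject);
    if ($result === null) {
        throw new RuntimeException(
            'preg_replace() failed: ' . preg_last_error_msg(),
            preg_last_error()
        );
    }
    return $result;
}

// Usage: a subject that is not valid UTF-8 makes the /u pattern fail.
try {
    echo checked_preg_replace('/./u', 'x', "\xff"), PHP_EOL;
} catch (RuntimeException $e) {
    echo $e->getMessage(), PHP_EOL;
}
```

Throwing instead of returning NULL keeps a failed replacement from silently blanking the page content, which is what happens when the NULL is assigned straight back to $text.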

I've explained this error in detail with answers to related questions. See:

RegExp in preg_match function returning browser error

PHP regex: is there anything wrong with this code?

Minifying final HTML output using regular expressions with CodeIgniter

Good luck!

Upvotes: 4

poke

Reputation: 388103

It works for me.

Note that from an HTML standpoint, your replacement does not create a valid structure.

Using the full text

It still works for me, even with the provided full HTML example. So there has to be something wrong with your other code; you might want to enable full error output to see if there’s some other issue.

Upvotes: 2
