Kendall Hopkins

Reputation: 44104

Find PHP with REGEX

I need a REGEX that can find blocks of PHP code in a file. For example:

    <? print '<?xml version="1.0" encoding="UTF-8"?>';?>
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

    <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
    <head>
        <?php echo "stuff"; ?>
    </head>
    </html>

When parsed by the REGEX, this would return:

array(
    "<? print '<?xml version=\"1.0\" encoding=\"UTF-8\"?>';?>",
    "<?php echo \"stuff\"; ?>"
);

You can assume the PHP is valid.

Upvotes: 0

Views: 178

Answers (5)

Gumbo

Reputation: 655369

token_get_all returns the list of PHP language tokens for a given piece of PHP code. You then just need to iterate over the list, looking for the open-tag tokens and their corresponding close tags.

$blocks = array();
$opened = false;
$buffer = '';
foreach (token_get_all($code) as $token) {
    if (!$opened) {
        // Outside PHP: wait for an opening tag.
        if (is_array($token) && ($token[0] === T_OPEN_TAG || $token[0] === T_OPEN_TAG_WITH_ECHO)) {
            $opened = true;
            $buffer = $token[1];
        }
    } else {
        if (is_array($token)) {
            // Array tokens are array(id, text, line).
            $buffer .= $token[1];
            if ($token[0] === T_CLOSE_TAG) {
                $opened = false;
                $blocks[] = $buffer;
            }
        } else {
            // Single-character tokens such as ";" are plain strings.
            $buffer .= $token;
        }
    }
}
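
As a usage sketch, the same loop can be wrapped in a function and run on a small sample. The function name extract_php_blocks and the sample string are made up for illustration, and the array destructuring assumes PHP 7.1 or newer. The sample sticks to <?php and <?= because plain <? is only tokenized as an open tag when short_open_tag is enabled.

```php
<?php
// Hypothetical helper name; the logic mirrors the loop above.
function extract_php_blocks(string $code): array
{
    $blocks = [];
    $buffer = null; // null while outside a PHP block
    foreach (token_get_all($code) as $token) {
        // Tokens are either [id, text, line] arrays or one-character strings.
        [$id, $text] = is_array($token) ? $token : [null, $token];
        if ($buffer === null) {
            if ($id === T_OPEN_TAG || $id === T_OPEN_TAG_WITH_ECHO) {
                $buffer = $text;
            }
            continue;
        }
        $buffer .= $text;
        if ($id === T_CLOSE_TAG) {
            $blocks[] = $buffer;
            $buffer = null;
        }
    }
    return $blocks;
}

// The "?>" inside the string literal does not end the first block.
$sample = "<?php print '<?xml version=\"1.0\"?>'; ?><p>hi</p><?= \$x ?>";
$blocks = extract_php_blocks($sample);
// $blocks[0] is: <?php print '<?xml version="1.0"?>'; ?>
// $blocks[1] is: <?= $x ?>
```

Note that a T_CLOSE_TAG token also swallows a single newline that directly follows the ?>, so blocks taken from a real file may end in "\n".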

Upvotes: 7

Liwen

Reputation: 937

<\?(?:php)?\s+.*?\?>$

with the following modifiers:

Dot match newlines

^ and $ match at line breaks
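
A rough sketch of how this pattern behaves in PHP, assuming the two modifiers above correspond to PCRE's s and m flags (that mapping and the sample strings are my assumptions). Because of the trailing $, a close tag only counts when it sits at the end of a line.

```php
<?php
// Assumption: "dot match newlines" is the s flag, line-break anchors are the m flag.
$pattern = '/<\?(?:php)?\s+.*?\?>$/sm';

// Close tags at the end of a line: both blocks are found, including the
// one whose string literal contains "?>".
$endOfLine = "<? print '<?xml version=\"1.0\"?>';?>\n<head>\n<?php echo \"stuff\"; ?>\n</head>";
$countA = preg_match_all($pattern, $endOfLine, $a);

// Close tag followed by markup on the same line: nothing matches.
$midLine = "<?php echo 1; ?><p>after</p>";
$countB = preg_match_all($pattern, $midLine, $b);

var_dump($countA, $countB); // int(2), int(0)
```

So the pattern copes with the question's sample, but only because every ?> there happens to end its line.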

Upvotes: 0

turbod

Reputation: 1988

Try this regex (untested):

preg_match_all('@<\?.*?\?>@si',$html,$m);
print_r($m[0]);
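
One case worth checking with this pattern is a ?> inside a string literal, as in the question's first line. In the small test below (the sample string is made up), the lazy .*? stops at the first ?> it sees, so the block comes back truncated.

```php
<?php
$html = "<?php print '<?xml version=\"1.0\"?>'; ?>\n<p>text</p>";
preg_match_all('@<\?.*?\?>@si', $html, $m);

// Only one match, cut off inside the string literal:
// <?php print '<?xml version="1.0"?>
print_r($m[0]);
```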

Upvotes: 0

Mike Dinescu

Reputation: 55730

This is the type of task that is much better suited to a custom parser. You could construct one relatively easily using a stack, and I can guarantee you will be done much quicker and pull out less hair than you would trying to debug your regex.

Regular expressions are great tools when used appropriately but not all text parsing tasks are equal.
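
A minimal sketch of what such a hand-written scanner might look like. The function name scan_php_blocks is hypothetical, and this version gets by with two state variables rather than a stack, which is a simplification on my part. It only tracks single- and double-quoted strings; comments, heredocs, and nowdocs are not handled.

```php
<?php
// Hypothetical helper: a hand-written scanner, not a full PHP lexer.
// Comments, heredocs and nowdocs are deliberately not handled.
function scan_php_blocks(string $src): array
{
    $blocks = [];
    $start = null; // offset of the current "<?", or null outside PHP
    $quote = null; // active string delimiter (' or "), or null
    $len = strlen($src);
    for ($i = 0; $i < $len; $i++) {
        $two = substr($src, $i, 2);
        if ($start === null) {
            if ($two === '<?') {
                $start = $i;
                $i++; // skip the "?"
            }
        } elseif ($quote !== null) {
            if ($src[$i] === '\\') {
                $i++; // skip the escaped character
            } elseif ($src[$i] === $quote) {
                $quote = null; // string literal ends
            }
        } elseif ($src[$i] === "'" || $src[$i] === '"') {
            $quote = $src[$i]; // string literal starts
        } elseif ($two === '?>') {
            $blocks[] = substr($src, $start, $i + 2 - $start);
            $start = null;
            $i++; // skip the ">"
        }
    }
    return $blocks;
}

$src = "<? print '<?xml version=\"1.0\"?>';?><p>x</p><?php echo \"stuff\"; ?>";
$found = scan_php_blocks($src);
// $found[0] is: <? print '<?xml version="1.0"?>';?>
// $found[1] is: <?php echo "stuff"; ?>
```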

Upvotes: 2

Jason McCreary

Reputation: 73001

Try the following regex using preg_match():

/<\?(?:php)?\s+(.*?)\?>/

That's untested, but it's a start. It assumes a closing PHP tag (arguably well-formed).
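
A quick sketch of how the pattern behaves, using preg_match_all() instead so that every block is returned rather than just the first (that swap and the sample strings are mine). The capture group holds the code between the tags. Without the s modifier the dot does not match newlines, so a block that spans several lines is missed.

```php
<?php
$pattern = '/<\?(?:php)?\s+(.*?)\?>/';

// Single-line blocks: $m[0] holds whole blocks, $m[1] the code between the tags.
$html = "<head><?php echo \"stuff\"; ?></head><? echo 1; ?>";
preg_match_all($pattern, $html, $m);
// $m[0] is: ['<?php echo "stuff"; ?>', '<? echo 1; ?>']
// $m[1] is: ['echo "stuff"; ', 'echo 1; ']

// A block spread over several lines only matches once "s" is added.
$multi = "<?php\necho 1;\n?>";
$without = preg_match_all($pattern, $multi, $x);
$with = preg_match_all($pattern . 's', $multi, $y);
var_dump($without, $with); // int(0), int(1)
```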

Upvotes: 0
