cmfolio

Reputation: 3443

Using REGEX with escaped quotes inside quotes

I have a PHP preg_match_all and REGEX question.

I have the following code:

<?php

$string= 'attribute1="some_value" attribute2="<h1 class=\"title\">Blahhhh</h1>"';

preg_match_all('/(.*?)\s*=\s*(\'|"|&#?\w+;)(.*?)\2/s', trim($string), $matches);

print_r($matches);

?>

That does not seem to pick up escaped quotes in the case where I want to pass in HTML containing quotes. I have tried numerous solutions based on the usual quotes-inside-quotes REGEX fixes, but none seem to work for me. I can't seem to place them correctly inside this pre-existing REGEX.

I am not a REGEX master; can someone please point me in the right direction?

The result I am trying to achieve is this:

Array
(
    [0] => Array
        (
            [0] => attribute1="some_value"
            [1] =>  attribute2="<h1 class=\"title\">Blahhhh</h1>"
        )

    [1] => Array
        (
            [0] => attribute1
            [1] =>  attribute2
        )

    [2] => Array
        (
            [0] => "
            [1] => "
        )

    [3] => Array
        (
            [0] => some_value
            [1] => <h1 class=\"title\">Blahhhh</h1>
        )
)

Thanks.

Upvotes: 4

Views: 143

Answers (1)

hakre

Reputation: 198217

You can solve this with a negative lookbehind assertion:

'/(.*?)\s*=\s*(\'|"|&#?\w+;)(.*?)(?<!\\\\)\2/'
                                 ^^^^^^^^^

The closing quote must not be preceded by \. Gives me:

Array
(
    [0] => Array
        (
            [0] => attribute1="some_value"
            [1] =>  attribute2="<h1 class=\"title\">Blahhhh</h1>"
        )

    [1] => Array
        (
            [0] => attribute1
            [1] =>  attribute2
        )

    [2] => Array
        (
            [0] => "
            [1] => "
        )

    [3] => Array
        (
            [0] => some_value
            [1] => <h1 class=\"title\">Blahhhh</h1>
        )
)
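
For reference, here is a minimal sketch of how the lookbehind pattern might be wired into the script from the question, assuming `/` as the delimiter on both ends and the question's input string. The `array_combine` step and the `$attributes` variable are illustrative additions, not part of the original answer.

```php
<?php
// Input from the question. Inside single quotes, \" stays a literal
// backslash followed by a double quote.
$string = 'attribute1="some_value" attribute2="<h1 class=\"title\">Blahhhh</h1>"';

// Four backslashes in PHP source become two in the regex, so the engine
// sees a lookbehind meaning "not preceded by a backslash".
$pattern = '/(.*?)\s*=\s*(\'|"|&#?\w+;)(.*?)(?<!\\\\)\2/';

preg_match_all($pattern, trim($string), $matches);

// Group 1 keeps the leading whitespace between pairs, so trim the names
// before pairing them with the values from group 3 (assumed convenience step).
$attributes = array_combine(array_map('trim', $matches[1]), $matches[3]);

print_r($attributes);
```

This should leave `$attributes['attribute2']` holding the HTML fragment with its escaped quotes intact.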

This regex isn't perfect, because the entity you put in there as a delimiter can, like the quotes, also be escaped with \. No idea if that is really intended.

See also this great question/answer: Split string by delimiter, but not if it is escaped.
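
A further edge case for the lookbehind is a value that ends in an escaped backslash, such as `"x\\"`: the real closing quote is preceded by a backslash and gets skipped. The sketch below uses a different technique from the answer's pattern: it consumes each backslash together with the character after it as one unit. It rests on several assumptions: the helper name `parseAttributes` is hypothetical, attribute names are taken to be `\w+`, and the entity delimiter (`&#?\w+;`) from the question's pattern is left out for brevity.

```php
<?php
/**
 * Hypothetical helper (not from the answer): parse name="value" pairs,
 * treating a backslash plus the following character as one escaped unit.
 */
function parseAttributes(string $input): array
{
    // Regex as the engine sees it: (\w+)\s*=\s*(["'])((?:\\.|(?!\2)[^\\])*)\2
    //   \\.          an escape sequence: a backslash plus any one character
    //   (?!\2)[^\\]  any other character that is neither the quote nor a backslash
    $pattern = '/(\w+)\s*=\s*(["\'])((?:\\\\.|(?!\2)[^\\\\])*)\2/s';

    preg_match_all($pattern, $input, $matches, PREG_SET_ORDER);

    $attributes = [];
    foreach ($matches as $m) {
        $attributes[$m[1]] = $m[3]; // name => raw value, escapes left untouched
    }
    return $attributes;
}

// The first value ends in an escaped backslash (two backslash characters).
$input = 'a="x\\\\" b="<h1 class=\"title\">Blahhhh</h1>"';
print_r(parseAttributes($input));
```

Because the two alternatives inside the value group cannot match at the same position (one starts with a backslash, the other excludes it), the pattern should not backtrack badly on long values.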

Upvotes: 1
