temuri

Reputation: 2807

Regexp to match up to the last occurrence of a symbol

I'm trying to figure out a PHP regexp.

Given this multi-line string:

var Data = {
    'a': 1,
    'b': '{"a":[{"b":{"id":1}}]}'
}

var Data = {
    'a': 2,
    'b': '{"a":[{"b":{"id":2}}]}'
};

// Some other text here that may have }; or }. Blahblah blah.
// };
// }

I need the following two matches from the string above:

Data = {
    'a': 1,
    'b': '{"a":[{"b":{"id":1}}]}'
}

Data = {
    'a': 2,
    'b': '{"a":[{"b":{"id":2}}]}'
}

I've tried Data\s?=\s?{[^}]+};? but it stops at the first closing brace and matches only:

Data = {
    'a': 1,
    'b': '{"a":[{"b":{"id":1}

Data = {
    'a': 2,
    'b': '{"a":[{"b":{"id":2}

Question: How do I change my regexp to achieve my goal?

Upvotes: 1

Views: 59

Answers (1)

revo

Reputation: 48751

First, if you are not sure whether opening and closing braces occur in equal numbers, a general solution would be:

Data\s*=\s{(?:[^:}]*:.*\R+)+}

Live demo

Breakdown:

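  • Data\s*=\s{ Match Data = { (optional whitespace before =, exactly one whitespace character after it)
  • (?: Start of non-capturing group
    • [^:}]*:.*\R+ Match one line: everything up to the first colon, the rest of the line, then one or more line breaks
  • )+ End of the group, repeated one or more times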
  • } Match ending brace

PHP code (Demo):

preg_match_all('~Data\s*=\s{(?:[^:}]*:.*\R+)+}~', $str, $matches);
print_r($matches[0]);
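
For a self-contained run, the snippet above still needs a value for $str. Here is a minimal sketch, assuming the question's sample text as input. The nowdoc and the $pattern variable name are additions for illustration, not part of the original snippet. The pattern is the same one, written with the x flag so the breakdown can sit next to each piece (the opening brace is escaped only for readability):

```php
<?php
// Sample input copied from the question.
$str = <<<'TXT'
var Data = {
    'a': 1,
    'b': '{"a":[{"b":{"id":1}}]}'
}

var Data = {
    'a': 2,
    'b': '{"a":[{"b":{"id":2}}]}'
};

// Some other text here that may have }; or }. Blahblah blah.
// };
// }
TXT;

// With the x flag, whitespace in the pattern is ignored and # starts a comment.
$pattern = '~
    Data\s*=\s\{    # Data, optional whitespace, =, one whitespace, literal opening brace
    (?:             # group for one line of the object body
        [^:}]*:     #   everything up to and including the first colon
        .*\R+       #   rest of the line, then one or more line breaks
    )+              # one or more such lines
    }               # closing brace right after the last line break
~x';

preg_match_all($pattern, $str, $matches);
print_r($matches[0]);
```

Because the closing brace must come directly after a line break, this form relies on the final } starting its own line with no indentation, as it does in the sample.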

Otherwise, as I mentioned in my comment, you need subroutine calls and recursion, both of which PCRE supports:

Data\s*=\s*({(?:[^{}]*|(?1))*})

Live demo
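
In PHP the recursive pattern could be used as in the rough sketch below. The input is hypothetical: it is similar to the question's data but with a real nested object whose closing brace sits on an indented line, to show that (?1) re-enters group 1 for every inner brace pair. The variable names are illustrative only.

```php
<?php
// Hypothetical input (not from the question): a nested object plus
// trailing text that contains stray closing braces.
$str = <<<'TXT'
var Data = {
    'a': 1,
    'b': {
        'id': 1
    }
};

// Some other text here that may have }; or }.
TXT;

// Group 1 matches one balanced {...} block; (?1) recurses into group 1
// whenever another opening brace is found inside it.
$pattern = '~Data\s*=\s*({(?:[^{}]*|(?1))*})~';

preg_match_all($pattern, $str, $matches);

print_r($matches[0]); // full matches, starting at "Data"
print_r($matches[1]); // only the balanced brace blocks
```

This form depends on the braces being balanced, and it counts braces inside string literals as well, so a lone { or } inside a quoted value would throw the matching off.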

Upvotes: 4
