Reputation: 469
I have the string The quick brown {fox, dragon, dinosaur} jumps over the lazy {dog, cat, bear, {lion, tiger}}.
I want to extract every substring enclosed in curly braces. Braces nested inside another pair should not be matched on their own; they stay part of the outer match. The expected output as a PHP array would be
[0] => fox, dragon, dinosaur
[1] => dog, cat, bear, {lion, tiger}
I tried the pattern \{([\s\S]*)\}
from "Regex pattern extract string between curly braces and exclude curly braces" (answered by Mar), but it captures everything from the first opening brace to the last closing brace as a single match instead of returning each braced group separately.
Here is the output of the pattern above
fox, dragon, dinosaur} jumps over the lazy {dog, cat, bear, {lion, tiger}
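For reference, here is a minimal sketch that reproduces the problem. The lazy variant and the variable names $greedy and $lazy are additions for illustration only; they are not taken from the linked answer. It suggests that neither a greedy nor a lazy flat pattern can follow nesting.

```php
<?php
// Sample sentence from the question.
$str = 'The quick brown {fox, dragon, dinosaur} jumps over the lazy {dog, cat, bear, {lion, tiger}}.';

// Greedy: [\s\S]* runs on to the LAST '}' in the string, so there is a single match.
preg_match_all('/\{([\s\S]*)\}/', $str, $greedy);

// Lazy: [\s\S]*? stops at the FIRST '}' it meets, which cuts a nested group short.
preg_match_all('/\{([\s\S]*?)\}/', $str, $lazy);

print_r($greedy[1]); // one element spanning both braced groups
print_r($lazy[1]);   // two elements, the second one missing the end of the nested group
```

Since a flat quantifier cannot count brace depth, the pattern has to track nesting in some other way.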
What is the best regex pattern to print the expected output from the sentence above?
Upvotes: 3
Views: 2310
Reputation: 476
As anubhava said, you can use a recursive pattern to do that.
However, his version is fairly slow and doesn't cover all cases.
I'd personally use this regex:
#({(?>[^{}]|(?0))*?})#
As you can see at http://lumadis.be/regex/test_regex.php?id=2516, it is a lot faster and matches more results.
So, how does it work?
/
(               # capturing group
  {             # matches a literal '{'
  (?>           # atomic group: the engine never backtracks into it
    [^{}]       # matches any character other than '{' or '}'
    |           # or
    (?0)        # recurses into the whole pattern to match a nested group
  )*?           # repeats as few times as needed (lazy)
  }             # matches a literal '}'
)               # end of the capturing group
/x
Adding '?' after '*' makes the quantifier non-greedy (lazy). With a greedy quantifier there, the engine starts far more subroutine calls than it does with a lazy one. (If you need more explanation, let me know.)
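A minimal sketch of how the pattern could be applied to the sentence from the question. The variable name $inner and the substr() step are assumptions added here: the pattern's capture group includes the outer braces, so they are trimmed afterwards to get the format the question asks for.

```php
<?php
// Sample sentence from the question.
$str = 'The quick brown {fox, dragon, dinosaur} jumps over the lazy {dog, cat, bear, {lion, tiger}}.';

// Pattern from this answer; '#' is the delimiter and (?0) recurses into the whole pattern.
$re = '#({(?>[^{}]|(?0))*?})#';

preg_match_all($re, $str, $matches);

// Each captured block still carries its outer '{' and '}',
// so drop one character from each end.
$inner = array_map(function ($m) {
    return substr($m, 1, -1);
}, $matches[1]);

print_r($inner);
```

Trimming after the match keeps the pattern itself unchanged; moving the capture group inside the braces would not work as is, because (?0) needs the braces to be part of the recursed pattern.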
Upvotes: 1
Reputation: 784998
You can use this recursive regex pattern in PHP:
$re = '/( { ( (?: [^{}]* | (?1) )* ) } )/x';
$str = "The quick brown {fox, dragon, dinosaur} jumps over the lazy {dog, cat, bear, {lion, tiger}}.";
preg_match_all($re, $str, $matches);
print_r($matches[2]);
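To show why $matches[2] is the index to print, here is a small sketch that repeats the snippet above and labels each group. The printf() formatting is only an illustration that mimics the layout used in the question.

```php
<?php
$re  = '/( { ( (?: [^{}]* | (?1) )* ) } )/x';
$str = "The quick brown {fox, dragon, dinosaur} jumps over the lazy {dog, cat, bear, {lion, tiger}}.";

preg_match_all($re, $str, $matches);

// $matches[0]: the full matches.
// $matches[1]: group 1, each braced block including its outer braces; (?1) recurses into it.
// $matches[2]: group 2, the text between the outer braces only.
foreach ($matches[2] as $i => $content) {
    printf("[%d] => %s\n", $i, $content);
}
```

Group 1 has to wrap the braces so that the recursion can match a complete nested block, which is why the brace-free text ends up in a second group.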
Upvotes: 4