Obadyah Anthony

Reputation: 65

Regex grab all text between brackets, and NOT in quotes

I'm trying to match all text between {brackets}, but not when it appears inside quotation marks. For example:

$str = 'value that I {want}, vs value "I do {NOT} want" ';

My results should capture "want" but omit "NOT". I've searched Stack Overflow extensively for a regex that does this, with no luck. I've seen answers that get the text between quotes, but none that get text in brackets only when it is outside quotes. Is this even possible?

And if so how is it done?

So far this is what I have:

preg_match_all('/{([^}]*)}/', $str, $matches);

Unfortunately, it matches all text inside brackets, including {NOT}.

Upvotes: 3

Views: 1822

Answers (2)

HamZa

Reputation: 14921

It's quite tricky to get this done in one go. I also wanted to make it compatible with nested brackets, so let's use a recursive pattern:

("|').*?\1(*SKIP)(*FAIL)|\{(?:[^{}]|(?R))*\}

OK, let's explain this mysterious regex:

("|')                   # match either a single or a double quote and put it in group 1
.*?                     # lazily match anything until ...
\1                      # match what was matched in group 1
(*SKIP)(*FAIL)          # make it skip this match since it's a quoted set of characters
|                       # or
\{(?:[^{}]|(?R))*\}     # match a pair of brackets (even if they are nested)
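
To see what the (*SKIP)(*FAIL) verbs contribute, here is a minimal sketch that runs the same alternation with and without them on the string from the question. It assumes a PCRE build that supports backtracking verbs (as PHP's preg functions do), and it uses a simplified, non-recursive bracket part to keep the comparison short:

```php
<?php
$str = 'value that I {want}, vs value "I do {NOT} want" ';

// Without the verbs, the quoted string is consumed by the first alternative,
// but it is still reported as a match of its own.
preg_match_all('~("|\').*?\1|\{[^{}]*\}~', $str, $without);

// With (*SKIP)(*FAIL), the quoted string is consumed and then thrown away,
// and the regex engine resumes scanning after the closing quote.
preg_match_all('~("|\').*?\1(*SKIP)(*FAIL)|\{[^{}]*\}~', $str, $with);

print_r($without[0]); // contains {want} and "I do {NOT} want"
print_r($with[0]);    // contains {want} only
```

Without the verbs you would have to filter the quoted matches out afterwards; with them, only the bracketed text outside quotes is ever returned.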


Some PHP code:

$input = <<<INP
value that I {want}, vs value "I do {NOT} want".
Let's make it {nested {this {time}}}
And yes, it's even "{bullet-{proof}}" :)
INP;

preg_match_all('~("|\').*?\1(*SKIP)(*FAIL)|\{(?:[^{}]|(?R))*\}~', $input, $m);

print_r($m[0]);

Sample output:

Array
(
    [0] => {want}
    [1] => {nested {this {time}}}
)
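
The output above includes the surrounding brackets. If only the inner text is needed, as in the question, one possible variation is to give up nesting support and wrap the bracket contents in a capture group. This is a sketch under that assumption, not part of the original pattern:

```php
<?php
$str = 'value that I {want}, vs value "I do {NOT} want" ';

// Group 1 is the opening quote; group 2 holds the text between the brackets.
// [^{}]* forbids inner brackets, so this variant does not handle nesting.
preg_match_all('~("|\').*?\1(*SKIP)(*FAIL)|\{([^{}]*)\}~', $str, $m);

print_r($m[2]); // inner text of each bracket pair found outside quotes
```

Because the quoted alternative always fails, every reported match comes from the bracket alternative, so $m[2] has one entry per match.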

Upvotes: 6

jszobody

Reputation: 28911

Personally, I'd process this in two passes: the first strips out everything between double quotes, and the second pulls out the text you want.

Something like this perhaps:

$str = 'value that I {want}, vs value "I do {NOT} want" ';

// Get rid of everything in between double quotes
$str = preg_replace("/\".*\"/U", "", $str);

// Now I can safely grab any text between curly brackets
preg_match_all("/\{(.*)\}/U", $str, $matches);

Working example here: http://3v4l.org/SRnva
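
The two passes can also be wrapped in a small helper. The sketch below is one way to do it; the function name bracesOutsideQuotes is hypothetical, and it swaps the U (ungreedy) modifier for negated character classes, which should behave the same on this input. It only handles double quotes and does not account for escaped quotes or nested brackets:

```php
<?php
// Hypothetical helper: returns the text inside {brackets} that is not
// enclosed in double quotes.
function bracesOutsideQuotes(string $str): array
{
    // Pass 1: remove every double-quoted run, quotes included.
    $stripped = preg_replace('/"[^"]*"/', '', $str) ?? '';

    // Pass 2: capture the contents of each remaining bracket pair.
    preg_match_all('/\{([^{}]*)\}/', $stripped, $matches);

    return $matches[1];
}

print_r(bracesOutsideQuotes('value that I {want}, vs value "I do {NOT} want" '));
```

Keeping the passes separate makes each regex easy to read, at the cost of scanning the string twice.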

Upvotes: 3
