sleepless_in_seattle

Reputation: 2214

Ignore characters in preg_match_all output

I have this regex:

preg_match_all('/{.*?}/', $html, $matches);

This returns all substrings enclosed in curly braces, but the entries in $matches also include the { and } characters. How can I remove them?

I don't want to do:

if ($matches[0] == "{variable}")

And I don't want to add ( and ) characters to the regexp because I don't want to use:

preg_match_all('/{(.*?)}/', $html, $matches);
if ($matches[1][0] == "variable")

So is there a simpler way to remove the curly braces from the $matches within the regex?

Upvotes: 4

Views: 942

Answers (3)

Jonny 5

Reputation: 12389

Alternatively, use \K to reset the start of the match after the {, then match any characters that are not }. If the braces are balanced, there is no need to match the closing }.

{\K[^}]*

See example on regex101
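For reference, here is a minimal sketch of how this pattern could be dropped into the call from the question. The sample $html string is made up for illustration, since the question does not show the real input.

```php
<?php
// Hypothetical sample input; the question does not show the real $html.
$html = '<p>Hello {name}, your order {order_id} has shipped.</p>';

// \K drops everything matched so far (the opening brace) from the reported match.
// [^}]* then consumes characters up to, but not including, the closing brace.
preg_match_all('/{\K[^}]*/', $html, $matches);

// $matches[0] holds the full matches, here without the braces.
print_r($matches[0]);
```

With this input, $matches[0] should contain name and order_id. No capture group is needed, so the results stay in $matches[0].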

Upvotes: 3

Sam

Reputation: 20486

In PCRE (the regex engine PHP uses), you can use lookarounds to make zero-length assertions. A lookbehind, (?<=...), asserts that the expression matches immediately before the current position. A lookahead, (?=...), asserts that it matches immediately after the current position. Both can be negated if need be: (?<!...) or (?!...).


This brings us to this expression:

(?<={).*?(?=})

Demo


Implement it the same way:

preg_match_all('/(?<={).*?(?=})/', $html, $matches);
// $matches[0][0] === 'variable'

@CasimirEtHippolyte makes a good point. This is a great example of where a lazy .*? is not necessary and can hurt performance through backtracking. You can replace the .*? with [^}]* to match zero or more non-} characters.
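A rough sketch of the lookaround version with the negated character class swapped in might look like this. The sample $html string is an assumption for illustration only.

```php
<?php
// Hypothetical sample input; not taken from the question.
$html = 'Dear {first_name} {last_name}, welcome.';

// (?<={) asserts a { immediately before the match without consuming it.
// [^}]* matches zero or more characters that are not }.
// (?=}) asserts a } immediately after the match without consuming it.
preg_match_all('/(?<={)[^}]*(?=})/', $html, $matches);

// $matches[0] holds the full matches, here without the braces.
print_r($matches[0]);
```

With this input, $matches[0] should contain first_name and last_name. One behavioral difference to keep in mind: [^}] also matches newlines, whereas . does not unless the s modifier is set.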

Upvotes: 7

vks

Reputation: 67968

(?<={).*?(?=})

Replace your regex with this. It will work.

Upvotes: 2
