BntMrx

Reputation: 2367

PHP nested regex

This string:

$subject = '\displaystyle{\ce{Cu^{2+}{(aq)}}+\ce{Zn{(s)}}\ce{->}\ce{Cu_{(s)}}+\ce{Zn^{2+}_{(aq)}}}';

I want to capture each \ce{...} group, including its nested braces.

My regex, inspired by the question "PHP - help with my REGEX-based recursive function":

$pattern = '#\\\\ce\{(?:[^{}]|(?R))*}#';

I tried with

preg_match_all($pattern, $subject, $matches);
print_r($matches);
Array
(
    [0] => Array
        (
            [0] => \ce{->}
        )
)

But as you can see, it doesn't work: only \ce{->}, the one group without nested braces, is matched.

Upvotes: 2

Views: 164

Answers (2)

user557597

This works in my tests.
Note that a nested \ce must itself be followed by balanced braces.
So \ce{Zn\cepp{(s)}} will fail to match,
while \ce{Zn^{2+}\ce{Zn^{2+}_{(aq)}}_{(aq)}} will match.
Otherwise, why look for \ce{} in the first place?

 #  '/\\\ce(\{(?:(?>(?!\\\ce)[^{}])+|(?R)|(?1))*\})/'

 \\ce
 (                  # (1 start)
      \{
      (?:
           (?>
                (?! \\ce )         # Not '\ce' ahead
                [^{}]              # A char, but not { or }
           )+
        |                   # or,
           (?R)               # Recurse whole expression
        |                   # or,
           (?1)               # Recurse group 1
      )*
      \}                
 )                  # (1 end)
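
As a quick check of the claims above, here is a minimal sketch that runs the pattern against the string from the question and against the two examples mentioned. The helper name ce_matches and the variable names are only illustrative, not part of the original answer.

```php
<?php
// The pattern from this answer as a single-quoted PHP string.
// '\\\\ce' reaches PCRE as \\ce, i.e. a literal backslash followed by "ce".
$pattern = '/\\\\ce(\{(?:(?>(?!\\\\ce)[^{}])+|(?R)|(?1))*\})/';

// Hypothetical helper: return every full match of $pattern in $text.
function ce_matches(string $pattern, string $text): array
{
    preg_match_all($pattern, $text, $m);
    return $m[0];
}

$subject = '\displaystyle{\ce{Cu^{2+}{(aq)}}+\ce{Zn{(s)}}\ce{->}\ce{Cu_{(s)}}+\ce{Zn^{2+}_{(aq)}}}';
$nested  = '\ce{Zn^{2+}\ce{Zn^{2+}_{(aq)}}_{(aq)}}';   // nested \ce with balanced braces
$broken  = '\ce{Zn\cepp{(s)}}';                        // inner \ce not followed by {

print_r(ce_matches($pattern, $subject));               // the five \ce{...} groups
var_dump(ce_matches($pattern, $nested) === [$nested]); // bool(true): matched as one unit
var_dump(count(ce_matches($pattern, $broken)));        // int(0): no match at all
```

The third case finds nothing because the negative lookahead stops the character branch at the inner \ce, and neither (?R) nor (?1) can continue from there without an opening brace.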

Upvotes: 0

anubhava

Reputation: 785156

You can use this recursive regex:

(\\ce(\{(?:[^{}]|(?-1))*\}))

Here (?-1) is a relative reference: it recurses into the most recently opened group, which is the 2nd subpattern, the one that starts right after \\ce.

Code:

$re = "/(
  \\\\ce
  (
    \\{
    (?:[^{}]|(?-1))*
    \\}
  )
)/x"; 

$str = 
 "\displaystyle{\ce{Cu^{2+}{(aq)}}+\ce{Zn{(s)}}\ce{->}\ce{Cu_{(s)}}+\ce{Zn^{2+}_{(aq)}}}"; 

if ( preg_match_all($re, $str, $m) )
   print_r($m[1]);

Output:

Array
(
    [0] => \ce{Cu^{2+}{(aq)}}
    [1] => \ce{Zn{(s)}}
    [2] => \ce{->}
    [3] => \ce{Cu_{(s)}}
    [4] => \ce{Zn^{2+}_{(aq)}}
)
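
To make the difference from the pattern in the question concrete, the sketch below runs both side by side. In the question's pattern, (?R) recurses the whole expression, which begins with \ce, so an inner {...} that is not preceded by \ce can never match; recursing only the brace group avoids that. The helper name find_ce and the variable names are illustrative assumptions.

```php
<?php
$subject = '\displaystyle{\ce{Cu^{2+}{(aq)}}+\ce{Zn{(s)}}\ce{->}\ce{Cu_{(s)}}+\ce{Zn^{2+}_{(aq)}}}';

// Pattern from the question: (?R) re-enters the whole pattern, including \ce.
$wholeRecursion = '#\\\\ce\{(?:[^{}]|(?R))*}#';

// Pattern from this answer: (?-1) re-enters only the {...} group.
$groupRecursion = '/(\\\\ce(\{(?:[^{}]|(?-1))*\}))/';

// Hypothetical helper: return the full matches of $pattern in $text.
function find_ce(string $pattern, string $text): array
{
    preg_match_all($pattern, $text, $m);
    return $m[0];
}

print_r(find_ce($wholeRecursion, $subject)); // only \ce{->}
print_r(find_ce($groupRecursion, $subject)); // all five \ce{...} groups
```

An absolute reference (?2) would behave the same as (?-1) here; the relative form simply keeps working if more groups are added in front of the pattern.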

Upvotes: 4
