John Sly

Reputation: 769

I can't get the results I'm looking for with regex and nested items

I'm working on some regex right now to isolate bracketed code such as this...

Regex: /\[(.*?)\]/

String: "<strong>[name]</strong>
<a href="http://www.example.com/place/[id]/">For more info...</a>"

Matched Fields: name, id

I'd like to make this a bit more advanced so that it handles nested brackets. What I'm looking to do...

String: "[if:name <strong>[name]</strong>]
<a href="http://www.example.com/place/[id]/">For more info...</a>"

Matched Fields: if:name <strong>[name]</strong>, id

The problem is, I can't figure out any regex that'll work for this. I'm pretty sure I've killed the better half of my day, and I feel like I'm pretty close.

Here's what I've got at the moment that isn't doing what I want...

/\[([^\]]+)\]/

Anyone have any ideas?
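To make the failure concrete, here is a minimal sketch that runs both patterns above against the nested sample string. Python is an assumption here (the question does not name a language), but both patterns use only syntax common to most regex engines. Each pattern stops at the first `]` it meets, so the outer tag is cut off in the middle.

```python
import re

# Sample input from the question: one tag nested inside another.
text = (
    '[if:name <strong>[name]</strong>]\n'
    '<a href="http://www.example.com/place/[id]/">For more info...</a>'
)

# Lazy version: .*? stops at the first closing bracket.
lazy = re.findall(r'\[(.*?)\]', text)

# Negated-class version: [^\]]+ also stops at the first closing bracket.
negated = re.findall(r'\[([^\]]+)\]', text)

print(lazy)     # ['if:name <strong>[name', 'id']
print(negated)  # ['if:name <strong>[name', 'id']
```

Neither pattern tracks how many brackets are open, which is why the first match ends after `[name` instead of after `</strong>`.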

Upvotes: 1

Views: 73

Answers (4)

user557597

Reputation:

This might help if you just want balanced brackets and/or want to recurse into the core to handle inner brackets. Many levels of nesting can be handled this way. This is just a framework for potentially much more complex usage. The balanced-text part is actually the easier one.

 # (?:(?>[^\\\[\]]+|(?:\\[\S\s])+)|(?>\[((?:(?&core)|))\]())|([\[\]])())(?:\2|\4)(?(DEFINE)(?<core>(?>[^\\\[\]]++|(?:\\[\S\s])++|\[(?:(?&core)|)\])+))

 (?:
      (?>
           [^\\\[\]]+ 
        |  
           (?: \\ [\S\s] )+
      )
   |  
      (?>
           \[
           (                       # (1) core content
                (?:
                     (?&core) 
                  |  
                )
           )
           \]
           ( )                     # (2) core flag
      )
   |  
      # unbalanced '[' or ']'
      ( [\[\]] )                   # (3) error content
      ( )                          # (4) error flag
 )

 (?: \2 | \4 )            # only let match if core flag or error flag is set
                          # this filters search to square brackets only
 (?(DEFINE)
      # core
      (?<core>
           (?>
                [^\\\[\]]++ 
             |  
                (?: \\ [\S\s] )++
             |  
                \[
                # recurse core
                (?:
                     (?&core) 
                  |  
                )
                \]
           )+
      )
 )


 # Perl sample, but the regex should also be valid in PHP
 # ----------------------------
 # use strict;
 # use warnings;
 # 
 # 
 # $/ = "";
 # 
 # my $data = <DATA> ;
 # 
 # parse( $data ) ;
 # 
 # 
 # sub parse
 # {
 #      my( $str ) = @_;
 #      while 
 #      (
 #           $str =~ /
 #               (?:(?>[^\\\[\]]+|(?:\\[\S\s])+)|(?>\[((?:(?&core)|))\]())|([\[\]])())(?:\2|\4)(?(DEFINE)(?<core>(?>[^\\\[\]]++|(?:\\[\S\s])++|\[(?:(?&core)|)\])+))
 #           /xg 
 #      )
 #      
 #      {
 #           if ( defined $1 )
 #           {
 #                print "found core \[$1\] \n";
 #                parse( $1 ) ;
 #           }
 #           if ( defined $3 )
 #           {
 #                print "unbalanced error '$3' \n";
 #           }
 #           
 #      }     
 # }
 # __DATA__
 # 
 # this [ [ is a test
 # [ outter [ inner ] ]

Upvotes: 0

progrenhard

Reputation: 2363

\[(.*)\]
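A quick check of how this greedy pattern behaves, sketched in Python (an assumption, since the question does not name a language). The two input strings are made up for illustration. Because `.*` is greedy, the pattern captures everything up to the last `]` on a line: that returns the desired result for a single nested tag, but it merges two separate tags that share a line.

```python
import re

pattern = re.compile(r'\[(.*)\]')

# One nested tag on the line: the greedy match runs to the final bracket.
nested = '[if:name <strong>[name]</strong>]'

# Two sibling tags on the same line (hypothetical input).
siblings = '[first] and [second]'

nested_result = pattern.findall(nested)
siblings_result = pattern.findall(siblings)

print(nested_result)    # ['if:name <strong>[name]</strong>']
print(siblings_result)  # ['first] and [second']
```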


Upvotes: 0

buzzsawddog

Reputation: 662

Rather than using a regex on HTML, it's easier to parse the file. I'm not sure what language you're using, so I'll give a Java parser as an example: JSoup lets you access the document using CSS selectors, which makes things much easier. Take a look through the tutorials and see if that helps.

Regexes are nice and powerful, don't get me wrong, but give a parser a try.

Upvotes: 0

Jerry

Reputation: 71558

PHP supports recursive syntax (like (?R)), so you can use this regex:

\[((?:[^\[\]]+|(?R))+)\]

regex101 demo

The results are: if:name <strong>[name]</strong>, id

(?R) re-enters the whole regex at that position, hence 'recursive', so an inner [...] is consumed as part of the outer match. The other components should be easy enough to understand; if not, regex101 provides quite a comprehensive description of them :)
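For readers whose engine has no recursion: Python's built-in `re` module, for example, does not support `(?R)`. The sketch below is not the regex above but a swapped-in technique, a plain depth counter, that returns the same results on the sample string. The function name `top_level_brackets` is made up for illustration, and the function can behave differently from the recursive regex on unbalanced or empty brackets.

```python
def top_level_brackets(text):
    """Return the contents of each outermost [...] group in text."""
    results = []
    depth = 0      # number of brackets currently open
    start = None   # index just after the outermost '['
    for i, ch in enumerate(text):
        if ch == '[':
            if depth == 0:
                start = i + 1
            depth += 1
        elif ch == ']' and depth > 0:
            depth -= 1
            if depth == 0:
                # The outermost bracket just closed: keep its contents.
                results.append(text[start:i])
    return results


sample = (
    '[if:name <strong>[name]</strong>]\n'
    '<a href="http://www.example.com/place/[id]/">For more info...</a>'
)

fields = top_level_brackets(sample)
print(fields)  # ['if:name <strong>[name]</strong>', 'id']
```

Calling the function again on each returned string would pick out the inner tags, such as `name` here.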

Upvotes: 2
