Max Dove

Reputation: 115

Regex for parsing complicated array

I'm trying to parse an input string that looks like array[digit or expression or array, digit or expression or array], so I need to get the two values inside [ , ]. I tried to get them with this regex:

(array1)\[(.*)\,(.*)\]

and read the values from the (.*) capturing groups, but it doesn't work because .* is a greedy quantifier, so in the case of:

array1[ array2[4,3] , array2[1,6] ]

I will get array2[4,3] , array2[1 as the first capturing group and 6] as the second, which is not right.

How can I get array2[4,3] as first and array2[1,6] as second capturing group? Or array2[array3[1,1],3] and 5+3 if the input string is array1[ array2[array3[1,1],3] , 5+3 ]?

Upvotes: 1

Views: 127

Answers (1)

Jerry

Reputation: 71538

You can make use of balancing groups:

array\d*\[\s*((?:[^\[\]]|(?<o>\[)|(?<-o>\]))+(?(o)(?!))),\s*((?:[^\[\]]|(?<o>\[)|(?<-o>\]))+(?(o)(?!)))\]

ideone demo on your last string.

A breakdown:

array\d*\[\s*    # Match array with its number (if any), first '[' and any spaces
(
  (?:                 
    [^\[\]]      # Match all non-brackets
  |
    (?<o>\[)     # Match '[', and capture into 'o' (stands for open)
  |
    (?<-o>\])    # Match ']', and delete the 'o' capture
  )+
  (?(o)(?!))     # Fails if 'o' still holds a capture (an unclosed '[')
)
,\s*             # Match comma and any spaces
(                # Repeat what was above...
  (?:            
    [^\[\]]      # Match all non-brackets
  |
    (?<o>\[)     # Match '[', and capture into 'o' (stands for open)
  |
    (?<-o>\])    # Match ']', and delete the 'o' capture
  )+
  (?(o)(?!))     # Fails if 'o' still holds a capture (an unclosed '[')
)
\]               # Last closing bracket
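
Balancing groups are a .NET regex feature, so the pattern above will not run in engines that lack them (Python's built-in re module, for example). If you are on such an engine, the same idea can be sketched by hand: keep a depth counter in place of the 'o' stack and only treat a comma as a separator when the depth is zero. This is a different technique from the regex, not a port of it, and the function name split_top_level is just something I made up for the illustration:

```python
def split_top_level(text):
    """Split the contents of the outermost [...] on top-level commas.

    Hypothetical helper: 'depth' plays the role of the 'o' capture stack
    in the regex (incremented on '[', decremented on ']').
    """
    start = text.index("[") + 1   # skip past the first '['
    parts = []
    current = []
    depth = 0
    for ch in text[start:]:
        if ch == "[":
            depth += 1
        elif ch == "]":
            if depth == 0:
                # This ']' closes the outermost array: we are done.
                parts.append("".join(current).strip())
                return parts
            depth -= 1
        elif ch == "," and depth == 0:
            # A comma outside any nested brackets separates two values.
            parts.append("".join(current).strip())
            current = []
            continue
        current.append(ch)
    raise ValueError("unbalanced brackets")


print(split_top_level("array1[ array2[4,3] , array2[1,6] ]"))
print(split_top_level("array1[ array2[array3[1,1],3] , 5+3 ]"))
```

Note that, unlike the regex, this splits on every top-level comma, so an array with three values would come back as three items rather than failing to match.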

Upvotes: 3
