Reputation: 7560
I'd like to exclude either one or another character with a RegEx.
I have a RegEx that searches for the pattern \[([^\[]+)\]\=\>(.*).
My problem is the last capture group. The value following the => should end at either a comma or a right parenthesis.
This is my text: Array([0]=>123,[1]=>Array([a]=>1,[b]=>2))
and I want to get:
// match 1
0 = 0
1 = 123
// match 2
0 = 1
1 = Array([a]=>1,[b]=>2)
This is my RegEx: \[([^\[]+)\]\=\>([^,\)]+)\)?
but I get:
// match 1
0 = 0
1 = 123
// match 2
0 = 1
1 = Array([a]=>1
// match 3
0 = b
1 = 2
Upvotes: 1
Views: 478
Reputation: 4325
The character class [^,\)] explicitly excludes the comma, so [^,\)]+ stops at the first comma and can never match the whole of Array([a]=>1,[b]=>2).
If you are OK with having only one level of nesting, you can try the following:
\[([^\]]+)\]=>(Array\([^\)]+\)|[^,\)]+)?
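As a quick check, here is a minimal sketch that runs the pattern above against the text from the question. The variable names (pattern, input, pairs) are only for illustration, and the g flag is added so that matchAll can walk over every pair:

```javascript
// One-level-nesting pattern from above, with the global flag added so that
// String.prototype.matchAll can iterate over every key/value pair.
const pattern = /\[([^\]]+)\]=>(Array\([^\)]+\)|[^,\)]+)?/g;

const input = "Array([0]=>123,[1]=>Array([a]=>1,[b]=>2))";

// Collect [key, value] pairs from capture groups 1 and 2.
const pairs = Array.from(input.matchAll(pattern), (m) => [m[1], m[2]]);

console.log(pairs);
// [ [ '0', '123' ], [ '1', 'Array([a]=>1,[b]=>2)' ] ]
```

The Array\(...\) alternative is tried first, so the nested array is captured as a whole before the simpler [^,\)]+ alternative gets a chance to stop at the comma. A second level of nesting would still break it, because [^\)]+ ends at the first right parenthesis.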
If you want arbitrarily nested definitions of Array, this problem cannot be solved with regular expressions, because the language you want to parse is not a regular language. You should use a parser generator or write a recursive-descent parser that implements the following grammar:
Start : Array
Array : "Array" "(" ElementList ")"
ElementList : "" | Elements
Elements : Element | Element "," Elements
Element : "[" String "]" "=>" Value
Value : Number | Array
Number : [1-9][0-9]*
String : [^\]]+
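If you would rather write the parser by hand, the grammar above could be turned into a recursive-descent parser roughly like this. It is only a sketch: the function names (parseArrayDump, expect, parseArray, parseElement, parseValue) are made up for this example, returning plain objects is my own choice, and whitespace is assumed not to occur in the input.

```javascript
// Recursive-descent parser sketch for the grammar above.
// All names here are hypothetical; the result is a plain nested object.
function parseArrayDump(text) {
  let pos = 0;

  // Consume a literal token or fail with the current position.
  function expect(token) {
    if (!text.startsWith(token, pos)) {
      throw new SyntaxError(`Expected "${token}" at position ${pos}`);
    }
    pos += token.length;
  }

  // Array : "Array" "(" ElementList ")"
  function parseArray() {
    expect("Array");
    expect("(");
    const result = {};
    // ElementList : "" | Elements
    if (text[pos] !== ")") {
      // Elements : Element | Element "," Elements
      parseElement(result);
      while (text[pos] === ",") {
        pos += 1;
        parseElement(result);
      }
    }
    expect(")");
    return result;
  }

  // Element : "[" String "]" "=>" Value
  function parseElement(target) {
    expect("[");
    // String : [^\]]+  (at least one character before the closing bracket)
    const end = text.indexOf("]", pos);
    if (end <= pos) {
      throw new SyntaxError(`Expected a key followed by "]" at position ${pos}`);
    }
    const key = text.slice(pos, end);
    pos = end;
    expect("]");
    expect("=>");
    target[key] = parseValue();
  }

  // Value : Number | Array
  function parseValue() {
    if (text.startsWith("Array", pos)) {
      return parseArray(); // recursion handles any nesting depth
    }
    // Number : [1-9][0-9]*
    const match = /^[1-9][0-9]*/.exec(text.slice(pos));
    if (match === null) {
      throw new SyntaxError(`Expected a number or Array at position ${pos}`);
    }
    pos += match[0].length;
    return Number(match[0]);
  }

  // Start : Array  (the whole input must be consumed)
  const value = parseArray();
  if (pos !== text.length) {
    throw new SyntaxError(`Unexpected input at position ${pos}`);
  }
  return value;
}

const parsed = parseArrayDump("Array([0]=>123,[1]=>Array([a]=>1,[b]=>2))");
console.log(JSON.stringify(parsed));
// {"0":123,"1":{"a":1,"b":2}}
```

Each grammar rule maps to one function, and the recursive call from parseValue back into parseArray is what a single regular expression cannot express. Note that the Number rule as written does not accept 0 as a value.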
Try looking for parser generators for JavaScript. PEG.js is an example: http://pegjs.majda.cz/
Upvotes: 5