Julian F. Weinert

Reputation: 7560

RegEx exclude one or another character

I'd like to stop a match at either one character or another with a RegEx. I have a RegEx that searches for the pattern \[([^\[]+)\]\=\>(.*).

My problem is the last capture group. The value following the => should end at either a comma or a right parenthesis.

This is my text: Array([0]=>123,[1]=>Array([a]=>1,[b]=>2)) and I want to get:

// match 1
0 = 0
1 = 123

// match 2
0 = 1
1 = Array([a]=>1,[b]=>2)

This is my RegEx: \[([^\[]+)\]\=\>([^,\)]+)\)? but I get:

// match 1
0 = 0
1 = 123

// match 2
0 = 1
1 = Array([a]=>1

// match 3
0 = b
1 = 2

Upvotes: 1

Views: 478

Answers (2)

Krzysztof Kosiński

Reputation: 4325

The character class [^,\)] explicitly excludes the comma, so it will never match Array([a]=>1,[b]=>2).

If you are OK with having only one level of nesting, you can try the following: \[([^\]]+)\]=>(Array\([^\)]+\)|[^,\)]+)?
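
As a quick check of the single-level pattern, here is a minimal JavaScript sketch that runs it against the sample text from the question. The variable names (pattern, text, pairs) are only illustrative, and the g flag is added so that all matches are returned.

```javascript
// Pattern from the answer: the value is either one flat Array(...) block
// or a run of characters containing neither a comma nor a right parenthesis.
const pattern = /\[([^\]]+)\]=>(Array\([^\)]+\)|[^,\)]+)?/g;
const text = "Array([0]=>123,[1]=>Array([a]=>1,[b]=>2))";

// matchAll requires the g flag; each match has the key in [1] and the value in [2].
const pairs = Array.from(text.matchAll(pattern), (m) => [m[1], m[2]]);

console.log(pairs);
// [ [ '0', '123' ], [ '1', 'Array([a]=>1,[b]=>2)' ] ]
```

The Array\(...\) alternative has to come first. Otherwise the [^,\)]+ alternative would win and stop at the first comma again.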

If you want to have arbitrarily nested definitions of Array, this problem cannot be solved by using regular expressions, because the language you want to parse is not a regular language. You should use a parser generator or write a recursive-descent parser which implements the following grammar:

Start : Array
Array : "Array" "(" ElementList ")"
ElementList : "" | Elements
Elements : Element | Element "," Elements
Element : "[" String "]" "=>" Value
Value : Number | Array
Number : [1-9][0-9]*
String : [^\]]+

Try looking for parser generators for JavaScript. PEG.js is an example: http://pegjs.majda.cz/
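
If you would rather write the parser by hand, a recursive-descent parser for the grammar above could look roughly like the following JavaScript sketch. The function names (parseArrayDump, expect, token and so on) are made up for illustration, and the result is returned as nested plain objects, which is a choice of this sketch rather than something the grammar prescribes. It follows the grammar literally, so it accepts no whitespace and a Number may not start with 0.

```javascript
// Sketch of a recursive-descent parser for the grammar above.
// All names are illustrative; nested Arrays become nested plain objects.
function parseArrayDump(input) {
  let pos = 0;

  // Consume an exact literal or fail, reporting the current position.
  function expect(literal) {
    if (!input.startsWith(literal, pos)) {
      throw new SyntaxError(`Expected "${literal}" at position ${pos}`);
    }
    pos += literal.length;
  }

  // Consume a token described by a sticky (y flag) regex and return its text.
  function token(regex, description) {
    regex.lastIndex = pos;
    const match = regex.exec(input);
    if (match === null) {
      throw new SyntaxError(`Expected ${description} at position ${pos}`);
    }
    pos += match[0].length;
    return match[0];
  }

  // Array : "Array" "(" ElementList ")"
  function parseArray() {
    expect("Array");
    expect("(");
    const elements = {};
    // ElementList : "" | Elements
    if (input[pos] !== ")") {
      // Elements : Element | Element "," Elements
      parseElement(elements);
      while (input[pos] === ",") {
        pos += 1;
        parseElement(elements);
      }
    }
    expect(")");
    return elements;
  }

  // Element : "[" String "]" "=>" Value
  function parseElement(target) {
    expect("[");
    const key = token(/[^\]]+/y, "a key");
    expect("]");
    expect("=>");
    target[key] = parseValue();
  }

  // Value : Number | Array
  function parseValue() {
    if (input.startsWith("Array", pos)) {
      return parseArray(); // recursion handles any nesting depth
    }
    return Number(token(/[1-9][0-9]*/y, "a number"));
  }

  // Start : Array
  const result = parseArray();
  if (pos !== input.length) {
    throw new SyntaxError(`Unexpected input at position ${pos}`);
  }
  return result;
}

const parsed = parseArrayDump("Array([0]=>123,[1]=>Array([a]=>1,[b]=>2))");
console.log(JSON.stringify(parsed));
// {"0":123,"1":{"a":1,"b":2}}
```

Each grammar rule maps to one function, and parseValue calling parseArray again is what lets this handle nesting that a single regular expression cannot.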

Upvotes: 5

M21B8

Reputation: 1887

The regex OR syntax is a pipe (|), e.g. "a|b" will match a or b.
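
A small JavaScript sketch of the alternation, with made-up variable names. Note that | binds more loosely than anything else in the pattern, so it is usually worth wrapping the alternatives in a group.

```javascript
// Grouped alternation: the whole string must be exactly "a" or exactly "b".
const eitherAOrB = /^(?:a|b)$/;
console.log(eitherAOrB.test("a")); // true
console.log(eitherAOrB.test("b")); // true
console.log(eitherAOrB.test("c")); // false

// Without the group, the alternatives are "^a" and "b$",
// so any string starting with "a" matches.
const ungrouped = /^a|b$/;
console.log(ungrouped.test("ax")); // true
```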

Upvotes: 0
