J.Doe

Reputation: 137

Regex inside Ruby

I have a rather simple regular expression (irony off), and Ruby is treating it differently than I expected.

string = puts worksheet.sheet_data[5][10].value

string.split(/(?>(?>\([^()]*(?R)?[^()]*\))|(?>\[[^[\]]*(?R)?[^[\]]*\])|(?>{[^{}]*(?R)?[^{}]*})|(?>"[^"]*")|(?>[^(){}[\]", ]+))(?>[ ]*(?R))*/)

I already replaced (?R) with \g<1>, but when I run it I still get the following error: premature end of char-class

I was told that I need to escape some closing brackets, because in Ruby a class like [^()] is supposedly treated as if the ] were still part of the set, so I have to change it to [^()\]. I did that everywhere, and my regex now looks like this:

string.split(/(?>(?>\([^()\]*\g<1>?[^()\]*\))|(?>\[[^[]\]*\g<1>?[^[]\]*])|(?>{[^{}\]*\g<1>?[^{}\]*})|(?>"[^"\]*")|(?>[^(){}[]", \]+))(?>[ \]*\g<1>)*/) 

It's basically the same, except that I removed the earlier \] escapes (because Ruby treats those brackets as escaped anyway) and added a \ before every closing bracket that had none. Ruby still throws the same exception. I tested the regex beforehand on regexr.com, so I expected it to work.

EDIT:

The sample text is: attribute1, attribute2 (further specification,(even further specification, etc), another specification), attribute3, attribute4

I should get: attribute1, attribute2(further specification,(even further specification, etc), another specification), attribute3, attribute4

The commas inside parentheses should be ignored.

Upvotes: 1

Views: 83

Answers (1)

Wiktor Stribiżew

Reputation: 627468

Instead of \g<1>, you need \g<0>: \g<1> recurses the pattern of capturing group #1, whereas (?R) recurses the whole regex pattern, and the whole pattern is group 0.
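
To illustrate the difference on something smaller, here is a minimal sketch using a toy balanced-parentheses pattern (the constant names and the sample text are made up for this illustration and are not taken from the question):

```ruby
# Toy example only: matches one balanced parenthesised group.

# \g<0> re-enters the whole pattern, so no capturing group is required.
WHOLE_PATTERN = /\((?:[^()]|\g<0>)*\)/

# \g<1> re-enters capturing group 1 only, so that group has to exist.
GROUP_ONE = /(\((?:[^()]|\g<1>)*\))/

# Hypothetical sample text.
TEXT = "x (a (b) c) y"

puts TEXT[WHOLE_PATTERN]   # whole match
puts TEXT[GROUP_ONE, 1]    # contents of capturing group 1
```

Both calls should pick out the outer parenthesised group. The regex in the question contains only atomic groups (?>...) and no capturing group, which is why \g<0> is the equivalent of (?R) there.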

Make sure you escape [ and ] inside character classes; they are special there in the Onigmo regex library.
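
A small sketch of that escaping rule, assuming a current Ruby with Onigmo (the constant names and the helper compiles? are hypothetical and exist only for this illustration). The pattern sources are single-quoted strings so the backslashes reach the regex engine unchanged:

```ruby
# One character class from the question, with [ left unescaped inside it.
UNESCAPED = '[^[\]]*'
# The same class with both brackets escaped.
ESCAPED = '[^\[\]]*'

# Hypothetical helper: reports whether a pattern source compiles.
def compiles?(source)
  Regexp.new(source)
  true
rescue RegexpError
  false
end

puts compiles?(UNESCAPED)  # the unescaped [ opens a nested class that is never closed
puts compiles?(ESCAPED)

# With both brackets escaped, the class simply excludes [ and ].
puts "ab]cd"[Regexp.new(ESCAPED)]
```

The first pattern is expected to be rejected with a RegexpError (the question reports it as "premature end of char-class"), while the second one compiles.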

You need

(?>(?>\([^()]*\g<0>?[^()]*\))|(?>\[[^\[\]]*\g<0>?[^\[\]]*\])|(?>{[^{}]*\g<0>?[^{}]*})|"[^"]*"|[^(){}\[\]", ]+)(?>[ ]*\g<0>)*

See the Rubular demo.
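
As a rough sketch of how this pattern could be applied to the sample text from the question: the pattern matches the items themselves rather than the separators, so this sketch uses String#scan instead of the split call from the question. That substitution, and the names TOKEN and SAMPLE, are assumptions made for illustration:

```ruby
# The pattern from above, unchanged, as a Ruby regex literal.
TOKEN = /(?>(?>\([^()]*\g<0>?[^()]*\))|(?>\[[^\[\]]*\g<0>?[^\[\]]*\])|(?>{[^{}]*\g<0>?[^{}]*})|"[^"]*"|[^(){}\[\]", ]+)(?>[ ]*\g<0>)*/

# Sample text taken from the question.
SAMPLE = "attribute1, attribute2 (further specification,(even further specification, etc), another specification), attribute3, attribute4"

# The pattern has no capturing groups, so scan returns the whole matches.
items = SAMPLE.scan(TOKEN)
items.each { |item| puts item }
```

For the sample text this should print four items, with the entire parenthesised part kept attached to attribute2 and the commas inside it left alone.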

Upvotes: 2
