Reputation: 1272
My program parses a multi-row SQL VALUES string into an array of single-row strings.
A typical input string looks like:
(11,'-1','Service A (nested parentheses)','en') (22,'-2','Service B (nested parentheses)','en')
Desired output:
11,'-1','Service A (nested parentheses)','en'
22,'-2','Service B (nested parentheses)','en'
I have tried the following regexp, with only partial success:
\(('.*?'|.*?)\)
What would be the right way to handle this with a regexp?
EDIT:
Upvotes: 5
Views: 6267
Reputation: 41838
EDIT: After your comment about smilies, I'll suggest an alternative approach:
(?<=\()(?:'[^']*'|[,\s]+|\d+)+(?=\))
See demo. This assumes that your tokens are either strings delimited by single quotes, or digits. Is that correct?
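As a rough illustration of how that pattern could be used from Java, here is a minimal sketch. The class name QuotedTokenRows and the method rows are hypothetical, and the second sample input (with a smiley inside a quoted string) is only my guess at the kind of data the comment was about.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
class QuotedTokenRows {
    // Pattern from the answer: between "(" and ")", accept only quoted
    // strings, comma/whitespace separators, or runs of digits.
    private static final Pattern ROW =
            Pattern.compile("(?<=\\()(?:'[^']*'|[,\\s]+|\\d+)+(?=\\))");

    static List<String> rows(String values) {
        List<String> out = new ArrayList<>();
        Matcher m = ROW.matcher(values);
        while (m.find()) {
            out.add(m.group()); // whole match, parentheses excluded by the lookarounds
        }
        return out;
    }

    public static void main(String[] args) {
        // Assumed example: an unbalanced ")" inside a quoted string.
        String input = "(1,'a (b) :)','en') (2,'c','fr')";
        rows(input).forEach(System.out::println);
    }
}
```

Because parentheses are only ever consumed inside '...' tokens, an unbalanced parenthesis in a string should not end the row early, provided the strings themselves contain no single quotes.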
Original Answer
With one potential level of nesting, this will work in most regex flavors, including Java:
(?<=\()(?:[^()]+|\([^)]+\))+
See demo
How does it work?
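(?<=\() is a lookbehind asserting that the match is preceded by an opening parenthesis.
The + quantifier on the non-capturing group (?: ... ) matches one or more of: (i) [^()]+, any number of characters that are not opening or closing parentheses, OR (the | alternation) (ii) \([^)]+\), a full (parenthesized expression).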
If you want to make sure that the container is balanced, add a lookahead at the end:
(?<=\()(?:[^()]+|\([^)]+\))+(?=\))
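To see the balanced version at work in Java, a sketch along these lines should do. The class name NestedParenRows and the method rows are made up for the example, and it assumes at most one level of nesting, as stated above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
class NestedParenRows {
    // Lookbehind for "(", then runs of non-parenthesis characters or one
    // fully parenthesized group, then a lookahead for the closing ")".
    private static final Pattern ROW =
            Pattern.compile("(?<=\\()(?:[^()]+|\\([^)]+\\))+(?=\\))");

    static List<String> rows(String values) {
        List<String> out = new ArrayList<>();
        Matcher m = ROW.matcher(values);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        String input = "(11,'-1','Service A (nested parentheses)','en') "
                + "(22,'-2','Service B (nested parentheses)','en')";
        rows(input).forEach(System.out::println);
    }
}
```

Note that this pattern knows nothing about quotes, so an unbalanced parenthesis inside a string would still throw it off.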
Upvotes: 2
Reputation: 1161
Pattern.compile("\\(((?:'[^']*'|[^'\\(\\)]+)+)\\)");
See it on RegexPlanet (click the Java link).
The meat of the regex is '[^']*'|[^'\(\)]+ : any series of characters surrounded by single quotes, OR any run of characters excluding single quotes and round brackets. This avoids having to use lookarounds, although the lookaround suggested by Casimir et Hippolyte may in fact be more efficient (I am not particularly familiar with the performance aspect of lookarounds in Java).
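For completeness, here is a minimal sketch of how the compiled pattern could be applied. The class name CaptureGroupRows and the method rows are hypothetical. The row content comes from capture group 1, since the full match includes the surrounding brackets.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
class CaptureGroupRows {
    // Outer \( ... \) are literal brackets; group 1 holds the row content,
    // built from quoted strings or runs of non-quote, non-bracket characters.
    private static final Pattern ROW =
            Pattern.compile("\\(((?:'[^']*'|[^'\\(\\)]+)+)\\)");

    static List<String> rows(String values) {
        List<String> out = new ArrayList<>();
        Matcher m = ROW.matcher(values);
        while (m.find()) {
            out.add(m.group(1)); // group(0) would include the brackets
        }
        return out;
    }

    public static void main(String[] args) {
        String input = "(11,'-1','Service A (nested parentheses)','en') "
                + "(22,'-2','Service B (nested parentheses)','en')";
        rows(input).forEach(System.out::println);
    }
}
```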
Upvotes: 1
Reputation: 16687
With caveats:
/\((.*)\)/\1/
will remove the surrounding parentheses, and
/\) \(/\r/g
will put in newlines as in your example.
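In Java the same two substitutions might look roughly like the sketch below. It assumes the first pattern is meant to capture everything between the outermost parentheses, it uses \n rather than \r as the line separator, and the class name ReplaceRows and method rows are invented for the example.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class, not part of the original answer.
class ReplaceRows {
    static List<String> rows(String values) {
        // Step 1: strip the outermost parentheses; the greedy .* runs
        // from the first "(" to the last ")" in the string.
        String inner = values.replaceFirst("\\((.*)\\)", "$1");
        // Step 2: turn every ") (" row boundary into a newline.
        String lines = inner.replaceAll("\\) \\(", "\n");
        return Arrays.asList(lines.split("\n"));
    }

    public static void main(String[] args) {
        String input = "(11,'-1','Service A (nested parentheses)','en') "
                + "(22,'-2','Service B (nested parentheses)','en')";
        rows(input).forEach(System.out::println);
    }
}
```

This only works while no quoted value contains the sequence ") (" itself, since the second step cannot tell a row boundary from string content.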
Caveats:
Upvotes: 0