Reputation: 2579
I'm learning regex and need to get all possible matches for a pattern out of a string.
If my input is:
case a
when cond1
then stmt1;
when cond2
then stmt2;
end case;
I need to get matches whose groups are as follows:
Group1:
"cond1"
"stmt1;"
and Group2:
"cond2"
"stmt2;"
Is it possible to get such groups using any regex?
Upvotes: 1
Views: 18014
Reputation:
I don't think this is possible, primarily because any group that matches when...then... is going to match all of them, creating multiple captures within the same group.
I'd suggest using this regex:
(?:when\s+(.*)\nthen\s+(.*)\n)+?
which results in:
Match 1:
* Group 1: cond1
* Group 2: stmt1;
Match 2:
* Group 1: cond2
* Group 2: stmt2;
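As a rough illustration of how those per-pair matches could be collected, here is a minimal Java sketch. The class name `WhenThenMatches` and the method `extract` are made up for this example, it assumes `\n` line endings, and it uses `\s+` after each keyword so the captured text carries no leading space.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of any library.
final class WhenThenMatches {
    // One match per when/then pair; '.' does not cross line breaks by default,
    // so each group stops at the end of its own line.
    private static final Pattern WHEN_THEN =
            Pattern.compile("when\\s+(.*)\\nthen\\s+(.*)\\n");

    // Returns every pair as a two-element array: {condition, statement}.
    public static List<String[]> extract(String input) {
        List<String[]> pairs = new ArrayList<>();
        Matcher m = WHEN_THEN.matcher(input);
        while (m.find()) {
            pairs.add(new String[] {m.group(1), m.group(2)});
        }
        return pairs;
    }

    public static void main(String[] args) {
        String input = "case a\nwhen cond1\nthen stmt1;\n"
                + "when cond2\nthen stmt2;\nend case;\n";
        for (String[] pair : extract(input)) {
            System.out.println(pair[0] + " -> " + pair[1]);
        }
    }
}
```

Each call to `find()` advances to the next when/then pair, so the "groups" asked for in the question end up as separate matches rather than as separate groups of a single match.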
Upvotes: 1
Reputation: 121772
If this were written in Java, I would write two patterns for the parser: one to match the case blocks and one to match the when-then pairs inside them. Here is how the latter could be written:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// inputString is the string you get after matching the case statements...
// DOTALL lets '.' cross the line break between "when ..." and "then ..."
Pattern pattern = Pattern.compile(
        "when\\s+(\\S+).*?"
        + "then\\s+(\\S+)",
        Pattern.DOTALL);
Matcher matcher = pattern.matcher(inputString);
while (matcher.find()) {
    // group(1) is the condition, group(2) the first token of the statement
    DoWhenThen(matcher.group(1), matcher.group(2));
}
Note: I haven't tested this code and I'm not 100% sure about the pattern, but this is where I'd start tinkering.
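To make the two-pattern idea concrete, here is a hedged sketch of both stages together. The first pattern (for the case block) is my own guess at what it could look like, since only the second one is given above; the class name `TwoStageCaseParser`, the method `parse`, and the `"cond => stmt"` output format are all invented for illustration. It assumes case blocks are not nested.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical two-stage parser; names and output format are illustrative only.
final class TwoStageCaseParser {
    // Stage 1 (assumed shape): capture the body of a non-nested "case X ... end case;".
    private static final Pattern CASE_BLOCK =
            Pattern.compile("\\bcase\\s+\\S+\\s+(.*?)end\\s+case;", Pattern.DOTALL);
    // Stage 2: one when/then pair inside that body (single-token condition and statement).
    private static final Pattern WHEN_THEN =
            Pattern.compile("when\\s+(\\S+).*?then\\s+(\\S+)", Pattern.DOTALL);

    public static List<String> parse(String source) {
        List<String> out = new ArrayList<>();
        Matcher caseMatcher = CASE_BLOCK.matcher(source);
        while (caseMatcher.find()) {
            // Run the second pattern only over the body captured by the first.
            Matcher pair = WHEN_THEN.matcher(caseMatcher.group(1));
            while (pair.find()) {
                out.add(pair.group(1) + " => " + pair.group(2));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String source = "case a\nwhen cond1\nthen stmt1;\n"
                + "when cond2\nthen stmt2;\nend case;";
        parse(source).forEach(System.out::println);
    }
}
```

Splitting the work this way keeps each pattern small, and the inner pattern never sees text outside the case block it belongs to.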
Upvotes: 0
Reputation: 84683
It's possible to use regex for this, provided that you don't nest your statements. For example, if your stmt1 is another case statement, then all bets are off (you can't use regex for something like that; you need a proper parser).
Edit: If you really want to try it you can do it with something like (not tested, but you get the idea):
Regex t = new Regex(@"when\s+(.*?)\s+then\s+(.*?;)", RegexOptions.Singleline);
MatchCollection allMatches = t.Matches(input_string);
But as I said, this will work only for non-nested statements.
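To see what goes wrong with nesting, here is a small Java port of the pattern above run against a nested input. The class name `NestedCaseDemo` and the sample input are invented for this sketch; `Pattern.DOTALL` is used as the Java counterpart of `RegexOptions.Singleline`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class showing the flat pattern applied to a nested case.
final class NestedCaseDemo {
    private static final Pattern WHEN_THEN =
            Pattern.compile("when\\s+(.*?)\\s+then\\s+(.*?;)", Pattern.DOTALL);

    // Returns every match as {condition, statement}.
    public static List<String[]> extract(String input) {
        List<String[]> pairs = new ArrayList<>();
        Matcher m = WHEN_THEN.matcher(input);
        while (m.find()) {
            pairs.add(new String[] {m.group(1), m.group(2)});
        }
        return pairs;
    }

    public static void main(String[] args) {
        // The statement for c1 is itself a case block (made-up input).
        String nested = String.join("\n",
                "case a",
                "when c1",
                "then case b",
                "when c2",
                "then s2;",
                "end case;",
                "when c3",
                "then s3;",
                "end case;");
        for (String[] p : extract(nested)) {
            System.out.println("[" + p[0] + "] -> [" + p[1] + "]");
        }
    }
}
```

On this input the pattern finds only two matches. The statement captured for `c1` stops at the first semicolon, so it is cut off before the inner `end case;`, and the inner `when c2` never appears as a match of its own. The pattern has no way to track which `end case;` closes which block.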
Edit 2: I changed the regex slightly to include the semicolon in the last group. This will not work exactly as you wanted: instead, it gives you multiple matches, each representing one when clause, with the condition in the first group and the statement in the second.
I don't think you can build a regex that does exactly what you want, but this should be close enough (I hope).
Edit 3: New regex - should handle multiple statements
Regex t = new Regex(@"when\s+(.*?)\s+then\s+(.*?)(?=(when|end))", RegexOptions.Singleline);
It contains a positive lookahead so that the second group matches from then to the next 'when' or 'end'. In my test it worked with this:
case a
when cond1
then stmt1;
stm1;
stm2;stm3
when cond2
then stmt2;
aaa;
bbb;
end case;
It's case-sensitive for now, so if you need case insensitivity, add the corresponding regex flag (RegexOptions.IgnoreCase).
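For anyone working in Java rather than C#, the lookahead version might be ported roughly as follows. This is a sketch with a few assumptions of mine that are not in the regex above: word boundaries (`\b`) around the keywords so that a statement containing "end" inside a longer word does not stop the match early, `Pattern.CASE_INSENSITIVE` switched on, and a `trim()` on the statement group because the lazy group otherwise keeps the trailing line break. The class name `MultiStatementWhenThen` is invented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical Java port of the lookahead regex; not a tested production parser.
final class MultiStatementWhenThen {
    // DOTALL mirrors RegexOptions.Singleline; CASE_INSENSITIVE mirrors IgnoreCase.
    private static final Pattern WHEN_THEN = Pattern.compile(
            "\\bwhen\\s+(.*?)\\s+then\\s+(.*?)(?=\\b(?:when|end)\\b)",
            Pattern.DOTALL | Pattern.CASE_INSENSITIVE);

    // Returns every match as {condition, statements}, with statements trimmed.
    public static List<String[]> extract(String input) {
        List<String[]> pairs = new ArrayList<>();
        Matcher m = WHEN_THEN.matcher(input);
        while (m.find()) {
            pairs.add(new String[] {m.group(1), m.group(2).trim()});
        }
        return pairs;
    }

    public static void main(String[] args) {
        String input = String.join("\n",
                "case a",
                "when cond1",
                "then stmt1;",
                "stm1;",
                "stm2;stm3",
                "when cond2",
                "then stmt2;",
                "aaa;",
                "bbb;",
                "end case;");
        for (String[] p : extract(input)) {
            System.out.println(p[0] + ":\n" + p[1] + "\n");
        }
    }
}
```

The lookahead is zero-width, so the `when` or `end` that terminates one statement group is still available as the start of the next match.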
Upvotes: 6