Abhishek Jain

Reputation: 4518

Nested Regex matching in Java

I have a scenario where I need to do nested expression matching in Java.

Consider the following expression:

SUM_ALL(2:3,4:5)>20

where SUM_ALL is a reserved operator in the application. I want to extract the operator name and its arguments from a given expression. To do that, I have defined my pattern as follows:

Pattern testPattern = Pattern.compile("[^a-zA-Z]*([a-zA-Z_]+)\\s*\\(\\s*([0-9:,]+)\\s*\\).*");

This works fine as long as the expression is limited to the form above. Here is the output:

Group 1: SUM_ALL
Group 2: 2:3,4:5

However, I may not know in advance how many such operators a given expression contains. For example, consider the following case:

SUM_ALL(4:5,6:7)>MAX(2:3,4:4)+MIN(3:4,5:7)

Here I want to extract each operator and its arguments, perform the calculation according to the operator's reserved meaning, and then evaluate the resulting simple math expression.

If the Java pattern matcher had a nesting capability, it would help to extract the operators one by one, considering the rest of the expression once an operator is resolved. I know I could capture the rest of the expression in a separate group, run the matcher on that group's value, and repeat until I reach the end of the expression. I would rather know whether the pattern matcher has built-in functionality for this.

Upvotes: 1

Views: 4889

Answers (2)

anubhava

Reputation: 784968

You can use code like this:

String str = "SUM_ALL(4:5,6:7)>MAX(2:3,4:4)+MIN(3:4,5:7)";
Matcher m = 
    Pattern.compile("(?i).*?([a-z_]+)\\s*\\(\\s*([\\d:,]+)\\s*\\)").matcher(str);
while (m.find())
   System.out.printf("%s :: %s%n", m.group(1), m.group(2));

OUTPUT:

SUM_ALL :: 4:5,6:7
MAX :: 2:3,4:4
MIN :: 3:4,5:7
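
The same loop can be wrapped in a small helper that returns the matches instead of printing them. This is only a sketch: the class name OperatorExtractor and the method extract are hypothetical, and the pattern assumes arguments consist solely of digits, colons and commas, as in the question. It relies on the fact that each call to Matcher.find() resumes scanning after the previous match, so no leading `.*?` or "rest of the expression" group is needed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OperatorExtractor {
    // Operator name, then a parenthesised run of digits, colons and commas.
    private static final Pattern CALL =
        Pattern.compile("([A-Za-z_]+)\\s*\\(\\s*([\\d:,]+)\\s*\\)");

    /** Returns one {name, arguments} pair per operator call, in order of appearance. */
    static List<String[]> extract(String expression) {
        List<String[]> calls = new ArrayList<>();
        Matcher m = CALL.matcher(expression);
        while (m.find()) {  // each find() resumes after the end of the previous match
            calls.add(new String[] { m.group(1), m.group(2) });
        }
        return calls;
    }

    public static void main(String[] args) {
        for (String[] call : extract("SUM_ALL(4:5,6:7)>MAX(2:3,4:4)+MIN(3:4,5:7)")) {
            System.out.println(call[0] + " :: " + call[1]);
        }
    }
}
```

Each returned pair can then be handed to whatever code implements the operator's reserved meaning.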

Upvotes: 1

Javier Diaz

Reputation: 1830

Well, have this:

(?:(SUM_ALL|MAX|MIN|addmorehere)\(((?:\d+:\d+,?){2})\)[-+><*/addmorehere]?)+

It's not really escaped for Java or any other language, but you get the idea.
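
A Java-escaped variant of the keyword-list idea might look like the sketch below. It assumes a fixed set of reserved names (SUM_ALL, MAX and MIN here, purely for illustration) and allows any number of start:end pairs instead of exactly two. It drives the pattern with Matcher.find() rather than an outer repeated group, because a repeated capturing group only retains its last iteration. The class name ReservedOperatorScanner and the method scan are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ReservedOperatorScanner {
    // Only the listed names count as operators; extend the alternation as needed.
    // Arguments are one or more start:end pairs separated by commas.
    private static final Pattern RESERVED_CALL = Pattern.compile(
        "\\b(SUM_ALL|MAX|MIN)\\s*\\(\\s*(\\d+:\\d+(?:\\s*,\\s*\\d+:\\d+)*)\\s*\\)");

    /** Returns "NAME :: args" for every reserved operator call found, in order. */
    static List<String> scan(String expression) {
        List<String> found = new ArrayList<>();
        Matcher m = RESERVED_CALL.matcher(expression);
        while (m.find()) {
            found.add(m.group(1) + " :: " + m.group(2));
        }
        return found;
    }

    public static void main(String[] args) {
        // FOO is not a reserved name in this sketch, so it is skipped.
        scan("SUM_ALL(4:5,6:7)>FOO(1:2)+MIN(3:4,5:7)").forEach(System.out::println);
    }
}
```

Compared with a generic `[a-z_]+` name pattern, listing the reserved names explicitly means unknown identifiers are ignored rather than reported as operators.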

Upvotes: 1
