Reputation: 5655
I was curious how it would be possible to split mathematical expressions containing parentheses meaningfully using Java's String regex support. It's hard to explain without an example, so one is given below.
A generic solution pattern would be appreciated, rather than one which just works for the example provided below.
String s = "(5 + 6) + (2 - 18)";
// I want to split this string via the regex pattern of "+",
// (but only the non-nested ones)
// with the result being [(5 + 6), (2 - 18)]
s.split("\\+"); // Won't work, this will split via every plus.
What I'm mainly looking for is first-level splitting. I want a regex that checks whether a symbol like "+" or "-" is nested in any form: if it is nested, don't split on it; if it isn't, split on it. Nesting can be in the form of () or [].
Thank you.
Upvotes: 0
Views: 771
Reputation: 311048
You can't know that you will never get more than one level of parentheses, and you can't analyze recursive syntax with a regular expression, by definition. You need to use or write a parser. Have a look around for Dijkstra's shunting-yard algorithm, a recursive descent expression parser, or a library that does either.
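For the first-level split asked about in the question, a full parser may be more than is needed. The following is a minimal sketch of a hand-written scanner that tracks bracket depth and splits only where the depth is zero. The class name TopLevelSplit and the method splitTopLevel are hypothetical names chosen for illustration. The sketch assumes the brackets are balanced, does not check that ( is closed by ) rather than ], and treats every top-level "-" as a binary operator, so a leading unary minus would produce an empty first element.

```java
import java.util.ArrayList;
import java.util.List;

class TopLevelSplit {
    // Hypothetical helper: splits expr at every '+' or '-' that sits
    // outside all () and [] groups. Assumes balanced brackets.
    static List<String> splitTopLevel(String expr) {
        List<String> parts = new ArrayList<>();
        int depth = 0; // number of brackets currently open
        int start = 0; // start index of the current piece
        for (int i = 0; i < expr.length(); i++) {
            char c = expr.charAt(i);
            if (c == '(' || c == '[') {
                depth++;
            } else if (c == ')' || c == ']') {
                depth--;
            } else if ((c == '+' || c == '-') && depth == 0) {
                // Operator at the first level: close the current piece here.
                parts.add(expr.substring(start, i).trim());
                start = i + 1;
            }
        }
        parts.add(expr.substring(start).trim());
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(splitTopLevel("(5 + 6) + (2 - 18)"));
        System.out.println(splitTopLevel("((6 + 5) - 4) + [2 - (3 + 1)]"));
    }
}
```

With the string from the question this yields the two pieces (5 + 6) and (2 - 18), and deeper nesting is left intact inside each piece. For anything beyond splitting, such as operator precedence or evaluation, the parsing approaches named above are the better fit.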
Upvotes: 0
Reputation: 2555
If you don't need to handle nested expressions like ((6 + 5)-4), here is a fairly simple function that extracts the sub-expressions without using regular expressions:
import java.util.ArrayList;

public static String[] subExprs(String expr) {
    /* Actual logic to split the expression */
    int fromIndex = 0;
    int subExprStart;
    ArrayList<String> subExprs = new ArrayList<String>();
    while ((subExprStart = expr.indexOf("(", fromIndex)) != -1) {
        /* Find the first ")" after the current "(" */
        int subExprEnd = expr.indexOf(")", subExprStart);
        if (subExprEnd == -1) {
            /* Unclosed "(": stop here instead of looping forever */
            break;
        }
        subExprs.add(expr.substring(subExprStart, subExprEnd + 1));
        fromIndex = subExprEnd + 1;
    }
    /* Logic only for printing */
    System.out.println("Original expression : " + expr);
    System.out.println();
    System.out.print("Sub expressions : [ ");
    for (String string : subExprs) {
        System.out.print(string + ", ");
    }
    System.out.print("]");
    return subExprs.toArray(new String[0]);
}
Sample output :
Original expression : (a+b)+(5+6)+(57-6)
Sub expressions : [ (a+b), (5+6), (57-6), ]
EDIT
For the extra condition of also getting expressions enclosed in [], this code will handle expressions inside both () and [].
import java.util.ArrayList;

public static String[] subExprs(String expr) {
    /* Actual logic to split the expression */
    int fromIndex = 0;
    ArrayList<String> subExprs = new ArrayList<String>();
    while (true) {
        int subExprStartParenthesis = expr.indexOf("(", fromIndex);
        int subExprStartSquareBracket = expr.indexOf("[", fromIndex);
        if (subExprStartParenthesis == -1 && subExprStartSquareBracket == -1) {
            break;
        }
        /* Check the type of the current bracket: whichever opens first */
        boolean isParenthesis = subExprStartSquareBracket == -1
                || (subExprStartParenthesis != -1
                        && subExprStartParenthesis < subExprStartSquareBracket);
        /* Extract the sub expression up to the first matching close bracket */
        int subExprStart = isParenthesis ? subExprStartParenthesis
                : subExprStartSquareBracket;
        int subExprEnd = expr.indexOf(isParenthesis ? ")" : "]", subExprStart);
        if (subExprEnd == -1) {
            /* Unclosed bracket: stop here instead of looping forever */
            break;
        }
        subExprs.add(expr.substring(subExprStart, subExprEnd + 1));
        fromIndex = subExprEnd + 1;
    }
    /* Logic only for printing */
    System.out.println("Original expression : " + expr);
    System.out.println();
    System.out.print("Sub expressions : [ ");
    for (String string : subExprs) {
        System.out.print(string + ", ");
    }
    System.out.print("]");
    return subExprs.toArray(new String[0]);
}
Sample Output :
Original expression : (a+b)+[5+6]+(57-6)-[a-b]+[c-d]
Sub expressions : [ (a+b), [5+6], (57-6), [a-b], [c-d], ]
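The functions above end each sub-expression at the first closing bracket, so nested input such as ((6 + 5)-4) would be cut short. If nesting does turn up, one possible approach is to keep a stack of the closing brackets still expected and emit a group only when the stack becomes empty again. This is a rough sketch under that assumption, not a tested replacement; the names NestedGroups and topLevelGroups are made up for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class NestedGroups {
    // Hypothetical helper: returns each outermost (...) or [...] group,
    // with any nested brackets kept inside it.
    static List<String> topLevelGroups(String expr) {
        List<String> groups = new ArrayList<>();
        // Closing brackets still expected, innermost on top.
        Deque<Character> expected = new ArrayDeque<>();
        int start = -1; // index where the current outermost group opened
        for (int i = 0; i < expr.length(); i++) {
            char c = expr.charAt(i);
            if (c == '(' || c == '[') {
                if (expected.isEmpty()) {
                    start = i;
                }
                expected.push(c == '(' ? ')' : ']');
            } else if (c == ')' || c == ']') {
                if (expected.isEmpty() || expected.pop() != c) {
                    throw new IllegalArgumentException(
                            "Unbalanced bracket at index " + i);
                }
                if (expected.isEmpty()) {
                    // Back at the first level: the group is complete.
                    groups.add(expr.substring(start, i + 1));
                }
            }
        }
        if (!expected.isEmpty()) {
            throw new IllegalArgumentException(
                    "Unclosed bracket opened at index " + start);
        }
        return groups;
    }
}
```

Calling topLevelGroups("((6 + 5)-4)+[a-(b+c)]") gives the two groups ((6 + 5)-4) and [a-(b+c)]. Unlike the versions above, this sketch rejects unbalanced or mismatched brackets with an exception rather than returning a partial result; whether that is the right behavior depends on the input you expect.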
Do suggest improvements in the code. :)
Upvotes: 1