Reputation: 27
I want to match alphanumeric words separated by the operators +, -, *, /, <, > and ending with a semicolon. There can be whitespace characters in between. For example, the following strings should return true:
first + second;
first - second;
first * second;
first / second;
first;
first + second < third;
third < second * first;
This is what I have tried:
public boolean isExpr(String line) {
    // factor = ([A-Za-Z]+|[0-9]+) for example: aasdaa or 23131 or xyz or 1 or a
    // simple-expr = (factor {mulop factor} {addop factor {mulop factor}})
    // expr = simple-expr compop simple-expr | simple-expr
    String factor = new String("([A-Za-Z]+|[0-9]+)");
    String mulOp = new String("(\\*|\\/)"); // '*'' or '/'
    String addOp = new String("(\\+|\\-)"); // '+' or '-'
    String compOp = new String("(\\<|\\="); // '<' or '='
    String simpleExpr = new String("(" + factor + " (" + mulOp + " " + factor + ")? (" + addOp + " " + factor + " (" + mulOp + " " + factor + ")?)?");
    String expr = new String("(" + simpleExpr + " " + compOp + " " + simpleExpr + ")|" + simpleExpr);
    System.out.println(line.matches(expr));
    return line.matches(expr);
}
What is wrong with that code and how can I solve it?
I got the below error on executing my code:
Exception in thread "main" java.util.regex.PatternSyntaxException: Illegal character range near index 9
((([A-Za-Z]+|[0-9]+) ((\*|\/) ([A-Za-Z]+|[0-9]+))? ((\+|\-) ([A-Za-Z]+|[0-9]+) ((\*|\/) ([A-Za-Z]+|[0-9]+))?)? (\<|\= (([A-Za-Z]+|[0-9]+) ((\*|\/) ([A-Za-eZ]+|[0-9]+))? ((\+|\-) ([A-Za-Z]+|[0-9]+) ((\*|\/) ([A-Za-Z]+|[0-9]+))?)?)|(([A-Za-Z]+|[0-9]+) ((\*|\/) ([A-Za-Z]+|[0-9]+))? ((\+|\-) ([A-Za-Z]+|[0-9]+) ((\*|\/) ([A-Za-Z]+|[0-9]+))?)?
Upvotes: 0
Views: 1041
Reputation: 79425
Instead of assembling the pattern from unnecessarily complex and error-prone pieces, I suggest you simply use the regex
[A-Za-z0-9]+(?:\s*[\/*+\-<>]\s*[A-Za-z0-9]+\s*)*;
which covers all the example strings you have posted in the question.
Explanation of the regex:
[A-Za-z0-9]+ : 1+ letters or digits
(?: : Open non-capturing group
\s* : 0+ whitespace characters
[\/*+\-<>] : One of /, *, +, -, <, >
\s* : 0+ whitespace characters
[A-Za-z0-9]+ : 1+ letters or digits
\s* : 0+ whitespace characters
) : Close non-capturing group
* : Quantifier to make the non-capturing group match 0+ times
; : The character literal ;
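To see the pieces of this regex working together, here is a minimal sketch that compiles the same pattern once and tries a few inputs that should be rejected as well as one without any whitespace. The class name ExprPatternCheck and the sample inputs are my own assumptions for illustration; they are not part of the question.

```java
import java.util.regex.Pattern;

public class ExprPatternCheck {
    // Same regex as explained above, compiled once so it can be reused.
    private static final Pattern EXPR =
            Pattern.compile("[A-Za-z0-9]+(?:\\s*[\\/*+\\-<>]\\s*[A-Za-z0-9]+\\s*)*;");

    public static boolean isExpr(String line) {
        // Matcher#matches requires the whole input to match, like String#matches.
        return EXPR.matcher(line).matches();
    }

    public static void main(String[] args) {
        System.out.println(isExpr("first+second;"));  // no whitespace around the operator
        System.out.println(isExpr("first + second")); // missing semicolon
        System.out.println(isExpr("first + ;"));      // operator without a right operand
        System.out.println(isExpr("+ second;"));      // operator without a left operand
    }
}
```

Only the first input should be accepted, because every operator in the pattern must sit between two words and the string must end with a semicolon.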
Demo:
import java.util.stream.Stream;

public class Main {
    public static void main(String[] args) {
        // Test
        Stream.of(
            "first + second;",
            "first * second;",
            "first - second;",
            "first / second;",
            "first;",
            "first + second < third;",
            "third < second * first;"
        ).forEach(s -> System.out.println(isExpr(s)));
    }

    public static boolean isExpr(String line) {
        return line.matches("[A-Za-z0-9]+(?:\\s*[\\/*+\\-<>]\\s*[A-Za-z0-9]+\\s*)*;");
    }
}
Output:
true
true
true
true
true
true
true
Because the pattern is assembled from so many nested pieces, one or more of the parentheses in the final regex have not been closed. For example, I can see at least one part where the parenthesis has not been closed:
String compOp = new String("(\\<|\\="); // '<' or '='
It should be
String compOp = new String("(\\<|\\=)"); // '<' or '='
//----------------------------------^
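One way to locate mistakes like this is to compile each fragment on its own and print what the regex engine reports. The sketch below is only an illustration; the class name FragmentCheck and the helper describe are hypothetical. It passes the unclosed compOp group and the a-Z range from the question to Pattern.compile and reports any PatternSyntaxException.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class FragmentCheck {
    // Returns "OK" if the fragment compiles, otherwise the engine's error description.
    static String describe(String fragment) {
        try {
            Pattern.compile(fragment);
            return "OK";
        } catch (PatternSyntaxException e) {
            return e.getDescription() + " (index " + e.getIndex() + ")";
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("(\\<|\\="));           // unclosed group from the question
        System.out.println(describe("(\\<|\\=)"));          // group closed
        System.out.println(describe("([A-Za-Z]+|[0-9]+)")); // a-Z is not a valid range
        System.out.println(describe("([A-Za-z]+|[0-9]+)")); // range corrected to a-z
    }
}
```

Checking the fragments one at a time keeps each reported index small and easy to map back to the source string, which is much harder with the fully concatenated pattern.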
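Apart from this, here are a couple more things that you should address:
First, you do not need new String(...) to create these strings; plain string literals are enough. Keep the grouping parentheses, though, because the pieces are concatenated into a larger pattern:
String factor = "([A-Za-z]+|[0-9]+)";
String mulOp = "(\\*|\\/)"; // '*' or '/'
String addOp = "(\\+|\\-)"; // '+' or '-'
String compOp = "(\\<|\\=)"; // '<' or '='
Second, change a-Z to a-z. A range in a character class must run from a lower character to a higher one, and Z comes before a, which is why the exception reports an illegal character range.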
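If you would rather keep the grammar-style composition from the question, the following is one possible sketch of a repaired version, not the only way to write it. I am assuming that \s* should replace the single literal spaces, that {...} in the grammar comments means "zero or more" (hence * instead of ?), that > belongs in the comparison operators because the question text lists it, and that the trailing ; is part of the pattern. The class name GrammarExpr and the constant names are hypothetical.

```java
public class GrammarExpr {
    // String literals instead of new String(...); each fragment is a
    // self-contained group or character class so it can be concatenated safely.
    static final String FACTOR = "(?:[A-Za-z]+|[0-9]+)";
    static final String MUL_OP = "[*/]";
    static final String ADD_OP = "[+-]";
    static final String COMP_OP = "[<>=]";
    // term = factor {mulop factor}
    static final String TERM = FACTOR + "(?:\\s*" + MUL_OP + "\\s*" + FACTOR + ")*";
    // simple-expr = term {addop term}
    static final String SIMPLE_EXPR = TERM + "(?:\\s*" + ADD_OP + "\\s*" + TERM + ")*";
    // expr = simple-expr [compop simple-expr] followed by ';'
    static final String EXPR =
            SIMPLE_EXPR + "(?:\\s*" + COMP_OP + "\\s*" + SIMPLE_EXPR + ")?\\s*;";

    public static boolean isExpr(String line) {
        return line.matches(EXPR);
    }

    public static void main(String[] args) {
        System.out.println(isExpr("first + second < third;"));
        System.out.println(isExpr("third < second * first;"));
        System.out.println(isExpr("first < second < third;")); // two comparisons are rejected here
    }
}
```

Unlike the single character class in the simpler regex, this version follows the factor definition from the question, so a word is either all letters or all digits, and at most one comparison operator is allowed per expression.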
Upvotes: 6