Reputation: 31
The following regex has been validated on regex101 and works fine, matching either "()", or "[]" or "{}":
\(\)|\[]|\{}
However:
it's not very readable
in Java it gets even less readable, because every backslash must be doubled inside a string literal:
\\(\\)|\\[]|\\{}
but still works fine, as my test class shows.
Now I'd like to make it more readable by using Unicode escapes (which I expected to remove the need for regex escaping) and constants, defining it like this:
private static final String MATCH_OPENING_BRACE = "\u0028";
private static final String MATCH_CLOSING_BRACE = "\u0029";
private static final String MATCH_OPENING_SQUARE_BRACE = "\u005B";
private static final String MATCH_CLOSING_SQUARE_BRACE = "\u005D";
private static final String MATCH_OPENING_CURLY_BRACE = "\u007B";
private static final String MATCH_CLOSING_CURLY_BRACE = "\u007D";
private static final String MATCHING_OR_FLAG = "|";
private static final String COMPLETE_REGEX =
MATCH_OPENING_BRACE + MATCH_CLOSING_BRACE
+ MATCHING_OR_FLAG + MATCH_OPENING_SQUARE_BRACE + MATCH_CLOSING_SQUARE_BRACE
+ MATCHING_OR_FLAG + MATCH_OPENING_CURLY_BRACE + MATCH_CLOSING_CURLY_BRACE;
private static final String REGEX_REPLACEMENT = "";
so that I can write readable code like this:
@Override
public boolean isValid(String input) {
for (int i = input.length() / 2; i > 0; i--)
input = input.replaceAll(COMPLETE_REGEX, REGEX_REPLACEMENT);
return input.isEmpty();
}
instead of using that unreadable literal, like this:
@Override
public boolean isValid(String input) {
for (int i = input.length() / 2; i > 0; i--)
input = input.replaceAll("\\(\\)|\\[]|\\{}", "");
return input.isEmpty();
}
But here the following exception is thrown:
java.util.regex.PatternSyntaxException: Unclosed character class near index 7
()|[]|{}
^
I tried adding an escape char, like this:
private static final String MATCH_OPENING_CURLY_BRACE = "\\\u007B";
but that only gives a similar exception:
java.util.regex.PatternSyntaxException: Unclosed character class near index 8
()|[]|\{}
^
Any hints?
Upvotes: -7
Views: 83
Reputation: 31
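As @user85421 mentioned, using Unicode escapes doesn't remove the need for regex escaping, as I thought it would. The Java compiler translates \u0028 to ( before the regex engine ever sees the string, so the pattern still contains a bare metacharacter.
So escaping (, ), [ and { is still required. Here's a fix: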
private static final String MATCH_OPENING_BRACE = "\\(";
private static final String MATCH_CLOSING_BRACE = "\\)";
private static final String MATCH_OPENING_SQUARE_BRACE = "\\[";
private static final String MATCH_CLOSING_SQUARE_BRACE = "]";
private static final String MATCH_OPENING_CURLY_BRACE = "\\{";
private static final String MATCH_CLOSING_CURLY_BRACE = "}";
The above works fine, as does the following version with all the metacharacters escaped (see @g00se's comment below):
private static final String MATCH_OPENING_BRACE = "\\(";
private static final String MATCH_CLOSING_BRACE = "\\)";
private static final String MATCH_OPENING_SQUARE_BRACE = "\\[";
private static final String MATCH_CLOSING_SQUARE_BRACE = "\\]";
private static final String MATCH_OPENING_CURLY_BRACE = "\\{";
private static final String MATCH_CLOSING_CURLY_BRACE = "\\}";
Indeed, the original regex did NOT escape ALL the metacharacters (] and } are left unescaped), and it still works fine:
input = input.replaceAll("\\(\\)|\\[]|\\{}", "");
All the tests also run fine with every metacharacter escaped:
input = input.replaceAll("\\(\\)|\\[\\]|\\{\\}", "");
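To make the point about Unicode escapes concrete, here is a minimal sketch (the class name UnicodeEscapeDemo and the helper compiles are made up for illustration). It assumes nothing beyond java.util.regex and shows that the Unicode-escaped constant is the very same string as the unescaped literal, which is why the regex engine rejects it:

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class UnicodeEscapeDemo {

    // Same characters as the constants in the question, written with Unicode escapes.
    static final String UNICODE_REGEX = "\u0028\u0029|\u005B\u005D|\u007B\u007D";

    // The regex with backslash escapes, as in the working version.
    static final String ESCAPED_REGEX = "\\(\\)|\\[]|\\{}";

    // Hypothetical helper: reports whether a pattern compiles at all.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The compiler has already replaced the Unicode escapes with the plain characters.
        System.out.println(UNICODE_REGEX.equals("()|[]|{}")); // true
        System.out.println(compiles(UNICODE_REGEX));          // false
        System.out.println(compiles(ESCAPED_REGEX));          // true
    }
}
```

In other words, a Unicode escape changes how the character is spelled in the source file, not what the regex engine receives.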
Upvotes: -1
Reputation: 29670
One option is to use comments and whitespace to explain and format the expression:
String regex = """
(?x) # allows comments and ignores whitespace
\\(\\) # () escaped
| # or
\\[] # [] escaped
| # or
\\{} # {} escaped
""";
the formatting can be changed to your liking
drawback: any # or space that is meant to be matched literally must also be escaped
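The drawback can be seen in a small sketch (the class and field names are invented for this example). With the (?x) flag, an unescaped space is dropped and an unescaped # starts a comment, so both need a backslash when they are part of the text to match:

```java
import java.util.regex.Pattern;

public class CommentsModeDemo {

    // Unescaped: the spaces are ignored and "# b" is read as a comment,
    // so this pattern is effectively just "a".
    static final Pattern UNESCAPED = Pattern.compile("(?x) a # b");

    // Escaped: backslash-space and backslash-# are matched literally,
    // so this pattern matches the text "a # b".
    static final Pattern ESCAPED = Pattern.compile("(?x) a \\ \\# \\ b");

    public static void main(String[] args) {
        System.out.println(UNESCAPED.matcher("a").matches());     // true
        System.out.println(UNESCAPED.matcher("a # b").matches()); // false
        System.out.println(ESCAPED.matcher("a # b").matches());   // true
    }
}
```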
For longer sequences, Pattern#quote can be used. It is probably not that useful for short sequences (like ()).
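For completeness, a sketch of what the Pattern#quote variant could look like for this case (the class name QuotedPairs and the constant EMPTY_PAIR are hypothetical; the loop is the one from the question). Pattern.quote wraps each literal so that no manual escaping is needed, and the alternation is built with a plain | between the quoted parts:

```java
import java.util.regex.Pattern;

public class QuotedPairs {

    // Each pair is quoted as a literal; only the "|" between them is regex syntax.
    private static final Pattern EMPTY_PAIR = Pattern.compile(
            Pattern.quote("()") + "|" + Pattern.quote("[]") + "|" + Pattern.quote("{}"));

    // Same approach as in the question: repeatedly strip adjacent pairs
    // and check whether anything is left.
    static boolean isValid(String input) {
        for (int i = input.length() / 2; i > 0; i--) {
            input = EMPTY_PAIR.matcher(input).replaceAll("");
        }
        return input.isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(isValid("([]{})")); // true
        System.out.println(isValid("([)]"));   // false
    }
}
```

Compiling the pattern once into a constant also avoids recompiling it on every call to replaceAll.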
Upvotes: 3