Reputation: 11559
I use regular expressions heavily to store business parameters in various tables (primarily business decision-tree logic). When thousands of business objects try to match themselves against these regex-driven parameters, calling String.matches() on various properties can be quite slow. So I created a class called MatchRegex, which acts as a property type over a regex String: it compiles the regex once internally and resets a single Matcher for each test input String.
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public final class MatchRegex {
    private final String regex;
    private final Pattern pattern;
    private final Matcher matcher;

    private MatchRegex(String regex) {
        this.regex = regex;
        this.pattern = Pattern.compile(regex);
        // Placeholder input; it is replaced by reset(input) on every call.
        this.matcher = pattern.matcher("Hello");
    }

    public static MatchRegex of(String regex) {
        return new MatchRegex(regex);
    }

    public boolean matches(String input) {
        // Reuses the single Matcher instance held by this object.
        return matcher.reset(input).matches();
    }

    public String getRegex() {
        return regex;
    }
}
However, I'm a bit disturbed that I randomly get an error that makes little sense to me unless I dig into the Pattern source code. It fails on the return matcher.reset(input).matches() line. Is this a bug in the regex library? How do I fix it?
java.lang.StringIndexOutOfBoundsException: String index out of range: 7
at java.lang.String.charAt(Unknown Source)
at java.util.regex.Pattern$BmpCharProperty.match(Unknown Source)
at java.util.regex.Pattern$GroupTail.match(Unknown Source)
at java.util.regex.Pattern$BranchConn.match(Unknown Source)
at java.util.regex.Pattern$Slice.match(Unknown Source)
at java.util.regex.Pattern$Branch.match(Unknown Source)
at java.util.regex.Pattern$GroupHead.match(Unknown Source)
at java.util.regex.Pattern$GroupTail.match(Unknown Source)
at java.util.regex.Pattern$BranchConn.match(Unknown Source)
at java.util.regex.Pattern$BmpCharProperty.match(Unknown Source)
at java.util.regex.Pattern$BmpCharProperty.match(Unknown Source)
at java.util.regex.Pattern$BmpCharProperty.match(Unknown Source)
at java.util.regex.Pattern$BmpCharProperty.match(Unknown Source)
at java.util.regex.Pattern$Branch.match(Unknown Source)
at java.util.regex.Pattern$GroupHead.match(Unknown Source)
at java.util.regex.Pattern$BmpCharProperty.match(Unknown Source)
at java.util.regex.Matcher.match(Unknown Source)
at java.util.regex.Matcher.matches(Unknown Source)
Upvotes: 1
Views: 844
Reputation: 11559
I just realized, at the same moment as Jon Skeet, that Matcher is not thread-safe. I will need to use either a thread-local Matcher or synchronization. Hopefully it will not cost too much in performance.
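For reference, the thread-local option could look roughly like the sketch below. This is only an illustration under assumptions: the class name ThreadLocalMatchRegex is hypothetical, and it relies on ThreadLocal.withInitial, which requires Java 8 or later. Each thread gets its own Matcher and reuses it through reset(), while the compiled Pattern is shared.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical variant of MatchRegex: one Matcher per thread, reused via reset().
final class ThreadLocalMatchRegex {
    private final String regex;
    private final ThreadLocal<Matcher> matcher;

    private ThreadLocalMatchRegex(String regex) {
        this.regex = regex;
        // A compiled Pattern is immutable, so sharing it between threads is fine.
        Pattern pattern = Pattern.compile(regex);
        // Each thread lazily creates its own Matcher the first time it calls get().
        this.matcher = ThreadLocal.withInitial(() -> pattern.matcher(""));
    }

    static ThreadLocalMatchRegex of(String regex) {
        return new ThreadLocalMatchRegex(regex);
    }

    boolean matches(String input) {
        // Only the calling thread ever touches this Matcher instance.
        return matcher.get().reset(input).matches();
    }

    String getRegex() {
        return regex;
    }
}
```

The synchronization option would instead mark matches() as synchronized on the original class, which keeps a single Matcher but serializes all callers on that instance.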
UPDATE: I am guessing the most efficient strategy is simply to create a new Matcher on each call.
import java.util.regex.Pattern;

public final class MatchRegex {
    private final String regex;
    private final Pattern pattern;

    private MatchRegex(String regex) {
        this.regex = regex;
        // Compile once; the Pattern can be shared between threads.
        this.pattern = Pattern.compile(regex);
    }

    public static MatchRegex of(String regex) {
        return new MatchRegex(regex);
    }

    public boolean matches(String input) {
        // A new Matcher per call, so no mutable state is shared.
        return pattern.matcher(input).matches();
    }

    public String getRegex() {
        return regex;
    }
}
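To sanity-check the per-call Matcher approach under concurrency, something like the following sketch could be used. The class name ConcurrentMatchCheck, the regex, and the test inputs are all made up for illustration; it shares one compiled Pattern across a small thread pool and counts how many results come back wrong.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.regex.Pattern;

// Hypothetical harness: many threads share one Pattern, each call gets its own Matcher.
final class ConcurrentMatchCheck {
    // Illustrative regex only; any compiled Pattern could be shared the same way.
    private static final Pattern PATTERN = Pattern.compile("(AB|CD)[0-9]{3,}");

    // Same idea as the updated matches(): a new Matcher for every call.
    static boolean matches(String input) {
        return PATTERN.matcher(input).matches();
    }

    // Runs the same checks from several threads and returns the number of wrong answers.
    static int countWrongResults(int threads, int iterationsPerThread) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                Callable<Integer> task = () -> {
                    int wrong = 0;
                    for (int i = 0; i < iterationsPerThread; i++) {
                        if (!matches("AB12345")) wrong++; // expected to match
                        if (matches("XY1")) wrong++;      // expected not to match
                    }
                    return wrong;
                };
                futures.add(pool.submit(task));
            }
            int total = 0;
            for (Future<Integer> f : futures) {
                total += f.get();
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            // An exception inside a task (such as the one in the question) ends up here.
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Calling ConcurrentMatchCheck.countWrongResults(4, 1000) should return 0 with this design. A clean run does not prove thread safety on its own, but an exception or a non-zero count would indicate shared mutable state.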
Upvotes: 1