tmn

Reputation: 11559

Java Compiled Regex Error

I rely heavily on regular expressions to store business parameters in various tables (primarily business decision-tree logic). When thousands of business objects try to match themselves against these regex-driven parameters, calling String.matches() on their properties is quite slow, because it recompiles the pattern on every call. So I created a class called MatchRegex that acts as a property type over a regex String. Internally it compiles the regex once and reuses a single Matcher, resetting it with each input String.

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public final class MatchRegex {
    private final String regex;
    private final Pattern pattern;
    // One Matcher instance, created once and reused for every call to matches()
    private final Matcher matcher;

    private MatchRegex(String regex) {
        this.regex = regex;
        this.pattern = Pattern.compile(regex);
        // Placeholder input; the Matcher is reset with the real input later
        this.matcher = pattern.matcher("Hello");
    }
    public static MatchRegex of(String regex) {
        return new MatchRegex(regex);
    }
    public boolean matches(String input) {
        return matcher.reset(input).matches();
    }
    public String getRegex() {
        return regex;
    }
}

However, I intermittently get an exception that makes little sense to me without digging into the Pattern source code. It is thrown on the return matcher.reset(input).matches() line. Is this a bug in the regex library? How do I fix it?

java.lang.StringIndexOutOfBoundsException: String index out of range: 7
    at java.lang.String.charAt(Unknown Source)
    at java.util.regex.Pattern$BmpCharProperty.match(Unknown Source)
    at java.util.regex.Pattern$GroupTail.match(Unknown Source)
    at java.util.regex.Pattern$BranchConn.match(Unknown Source)
    at java.util.regex.Pattern$Slice.match(Unknown Source)
    at java.util.regex.Pattern$Branch.match(Unknown Source)
    at java.util.regex.Pattern$GroupHead.match(Unknown Source)
    at java.util.regex.Pattern$GroupTail.match(Unknown Source)
    at java.util.regex.Pattern$BranchConn.match(Unknown Source)
    at java.util.regex.Pattern$BmpCharProperty.match(Unknown Source)
    at java.util.regex.Pattern$BmpCharProperty.match(Unknown Source)
    at java.util.regex.Pattern$BmpCharProperty.match(Unknown Source)
    at java.util.regex.Pattern$BmpCharProperty.match(Unknown Source)
    at java.util.regex.Pattern$Branch.match(Unknown Source)
    at java.util.regex.Pattern$GroupHead.match(Unknown Source)
    at java.util.regex.Pattern$BmpCharProperty.match(Unknown Source)
    at java.util.regex.Matcher.match(Unknown Source)
    at java.util.regex.Matcher.matches(Unknown Source)

Upvotes: 1

Views: 844

Answers (1)

tmn

Reputation: 11559

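I just realized, at the same moment as Jon Skeet, that Matcher is not thread-safe. The single Matcher instance is shared by every caller, so one thread can call reset(input) while another thread is still inside matches(), which would explain the StringIndexOutOfBoundsException. I will need either a thread-local Matcher or synchronization. Hopefully it will not cost too much in performance.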
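
For reference, the thread-local option could look roughly like the following. This is only a minimal sketch: the class name ThreadLocalMatchRegex is made up for illustration, and it assumes Java 8 or later for ThreadLocal.withInitial. Each thread lazily gets its own Matcher and reuses it through reset(), so no Matcher is ever shared between threads.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical variant of MatchRegex (name chosen for illustration only):
// keeps one Matcher per thread instead of one Matcher per instance.
final class ThreadLocalMatchRegex {
    private final String regex;
    // Mutable Matcher state is confined to the thread that created it.
    private final ThreadLocal<Matcher> matcher;

    private ThreadLocalMatchRegex(String regex) {
        this.regex = regex;
        // The compiled Pattern is immutable, so every thread can share it.
        Pattern compiled = Pattern.compile(regex);
        // Each thread creates its own Matcher the first time it calls get().
        this.matcher = ThreadLocal.withInitial(() -> compiled.matcher(""));
    }

    static ThreadLocalMatchRegex of(String regex) {
        return new ThreadLocalMatchRegex(regex);
    }

    boolean matches(String input) {
        // reset(input) replaces the text this thread's Matcher works on.
        return matcher.get().reset(input).matches();
    }

    String getRegex() {
        return regex;
    }
}
```

Whether this is actually faster than creating a new Matcher per call would have to be measured. It also keeps one Matcher alive per thread per regex, which may matter with large thread pools.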

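UPDATE: I am guessing the most efficient strategy is simply to create a new Matcher on each call. The compiled Pattern is immutable and safe to share between threads, so the expensive compilation still happens only once.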

import java.util.regex.Pattern;

public final class MatchRegex {
    private final String regex;
    // Compiled once; Pattern instances are immutable and thread-safe
    private final Pattern pattern;

    private MatchRegex(String regex) {
        this.regex = regex;
        this.pattern = Pattern.compile(regex);
    }
    public static MatchRegex of(String regex) {
        return new MatchRegex(regex);
    }
    public boolean matches(String input) {
        // A fresh Matcher per call, so no mutable state is shared between threads
        return pattern.matcher(input).matches();
    }
    public String getRegex() {
        return regex;
    }
}
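
To convince myself that sharing only the Pattern is safe, a quick concurrent check along these lines can help. It is a rough sketch, not a proof of thread safety: the class name SharedPatternCheck, the sample regex, and the inputs are all made up for illustration. Several threads match against one shared Pattern, each call creating its own Matcher, and the method counts any wrong result.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.regex.Pattern;

// Hypothetical helper (illustration only): hammers one shared Pattern from
// several threads and reports how many match results were wrong.
final class SharedPatternCheck {

    static int countWrongResults(int threads, int iterationsPerThread) {
        // Sample regex and inputs are arbitrary, chosen only for this check.
        Pattern pattern = Pattern.compile("(AB|CD)\\d{4}[A-Z]*");
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                futures.add(pool.submit(() -> {
                    int wrong = 0;
                    for (int i = 0; i < iterationsPerThread; i++) {
                        // A new Matcher per call, as in the updated MatchRegex.
                        if (!pattern.matcher("AB1234XYZ").matches()) wrong++;
                        if (pattern.matcher("EF12").matches()) wrong++;
                    }
                    return wrong;
                }));
            }
            int total = 0;
            for (Future<Integer> f : futures) {
                total += f.get(); // rethrows any exception raised in a task
            }
            return total;
        } catch (Exception e) {
            throw new IllegalStateException("concurrent check failed", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Calling SharedPatternCheck.countWrongResults(8, 100000) should return 0 and throw nothing, since no Matcher is shared between threads.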

Upvotes: 1
