tukimin
tukimin

Reputation: 53

how to track catastrophic backtracking in your regular expressions?

I am using Apache regexp as library to validate regex. I want to know how to track if some regex is causing catastrophic backtracking. What I want to know is, is there some trick to catch which regex and the string value that causing catastrophic backtracking? I tried a little modify in class RE.java, but not as expected.

This my modification:

    public RE(String pattern) throws RESyntaxException
{
    this(pattern, MATCH_NORMAL);
    paramString = pattern;
}


public RE(String pattern, int matchFlags) throws RESyntaxException
{
    this(new RECompiler().compile(pattern), matchFlags);
    paramString = pattern;
}


int callcounterMN = 0;
protected int matchNodes(int firstNode, int lastNode, int idxStart)
{
    callcounterMN++;
    if (callcounterMN == 100) {
        try {
            String pc1 = new Exception().getStackTrace()[5].getClassName();
            if (pc1.indexOf("UpdateWebForm") > 1)     
                System.out.println("regex loop reach "+callcounterMN+"  with regex : "+paramString+" "+this.search.substring(0));
        } catch (Exception e) {}
    }

Upvotes: 3

Views: 1432

Answers (1)

Tao
Tao

Reputation: 14006

Much later, but given that there is still no answer I might as well pitch in: Google's RE2 regex library aims to prevent catastrophic backtracking issues altogether, sometimes at the cost of some level of performance: https://github.com/google/re2/wiki/WhyRE2

This is not exactly an answer to your question, in that this is about accepting any regex and ensuring that it never causes catastrophic backtracking hangs, rather than detecting those that will using the Apache library, but hopefully it's still useful input to some proportion of visitors to this question. If you can afford to not support some regex pattern constructs, you can afford the performance hit in some cases, and you can afford to test and swap out the library you are using for this one - then you might have solved the problem.

Upvotes: 1

Related Questions