Reputation: 79

antlr4 performance on a multiple-core CPU

Recently, I ran into a performance problem in my program. Investigation eventually pointed to an issue deep inside antlr4, which I use to parse SQL. As the code below shows, there is a synchronized block on dfa.states. That block effectively caps parsing performance on a computer with 8 or more cores. Has anyone run into this and found a solution?

protected DFAState addDFAState(ATNConfigSet configs) {
    /* the lexer evaluates predicates on-the-fly; by this point configs
     * should not contain any configurations with unevaluated predicates.
     */
    assert !configs.hasSemanticContext;

    DFAState proposed = new DFAState(configs);
    ATNConfig firstConfigWithRuleStopState = null;
    for (ATNConfig c : configs) {
        if ( c.state instanceof RuleStopState ) {
            firstConfigWithRuleStopState = c;
            break;
        }
    }

    if ( firstConfigWithRuleStopState!=null ) {
        proposed.isAcceptState = true;
        proposed.lexerActionExecutor = ((LexerATNConfig)firstConfigWithRuleStopState).getLexerActionExecutor();
        proposed.prediction = atn.ruleToTokenType[firstConfigWithRuleStopState.state.ruleIndex];
    }

    DFA dfa = decisionToDFA[mode];
    synchronized (dfa.states) {
        DFAState existing = dfa.states.get(proposed);
        if ( existing!=null ) return existing;

        DFAState newState = proposed;

        newState.stateNumber = dfa.states.size();
        configs.setReadonly(true);
        newState.configs = configs;
        dfa.states.put(newState, newState);
        return newState;
    }
}

Upvotes: 1

Answers (2)

Water Guo

Reputation: 79

After a few days of struggle, I found a solution. As Mike Lischke said, the synchronized block seems to be there to reduce the memory footprint. But it has a significant impact on performance on a multi-core computer with a heavy SQL parsing workload. I was trying to parse a 100 GB+ SQL file generated by mysqldump.

My solution is to create a custom interpreter with its own freshly built DFA array instead of the shared static one. The result is almost 10 times faster on my 16-core AMD Threadripper, with CPU usage going above 95%.

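// Called from inside the lexer class, replacing the default interpreter
// that uses the static, shared DFA array.
setInterpreter(new LexerATNSimulator(this, _ATN, getDFA(), new PredictionContextCache()));

// Builds a new DFA array for this lexer instance, one DFA per decision,
// so that no other lexer instance synchronizes on the same dfa.states.
private DFA[] getDFA() {
    DFA[] result = new DFA[_ATN.getNumberOfDecisions()];
    for (int i = 0; i < _ATN.getNumberOfDecisions(); i++) {
        result[i] = new DFA(_ATN.getDecisionState(i), i);
    }
    return result;
}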
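
To show why a per-instance DFA removes the contention, here is a minimal sketch that does not use ANTLR at all. StateCache is a hypothetical stand-in for dfa.states, not ANTLR's code; it only copies the get-or-add-under-synchronized shape of addDFAState from the question. With one shared cache, every thread queues on the same monitor; with one cache per thread, the lock is never contended. The thread count, lookup count and key pattern are arbitrary assumptions, and timings will vary by machine.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

public class DfaCacheSketch {
    /** Hypothetical stand-in for dfa.states: get-or-add guarded by one monitor. */
    static final class StateCache {
        private final Map<String, Integer> states = new HashMap<>();

        int addState(String key) {
            synchronized (states) {
                Integer existing = states.get(key);
                if (existing != null) return existing;
                int stateNumber = states.size(); // next free number, as in addDFAState
                states.put(key, stateNumber);
                return stateNumber;
            }
        }

        int size() {
            synchronized (states) { return states.size(); }
        }
    }

    /** Starts `threads` workers, each doing `lookups` get-or-add calls; returns elapsed ms. */
    static long run(int threads, int lookups, Supplier<StateCache> cacheForThread)
            throws InterruptedException {
        Thread[] workers = new Thread[threads];
        long start = System.nanoTime();
        for (int t = 0; t < threads; t++) {
            StateCache cache = cacheForThread.get();
            workers[t] = new Thread(() -> {
                // Only 64 distinct keys, so almost every call is a cache hit
                // that still has to take the lock.
                for (int i = 0; i < lookups; i++) cache.addState("s" + (i % 64));
            });
            workers[t].start();
        }
        for (Thread w : workers) w.join();
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) throws InterruptedException {
        int threads = Runtime.getRuntime().availableProcessors();
        StateCache shared = new StateCache();
        long sharedMs = run(threads, 2_000_000, () -> shared);     // like the static DFA
        long privateMs = run(threads, 2_000_000, StateCache::new); // like one DFA per lexer
        System.out.println("shared: " + sharedMs + " ms, per-thread: " + privateMs + " ms");
    }
}
```

On a machine with many cores, the shared variant would be expected to scale poorly because even cache hits must acquire the monitor, which is the same effect the question describes.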

Upvotes: 1

Mike Lischke

Reputation: 53337

All parser instances for a given language share the same DFA (it's a static structure) for memory efficiency. However, that requires making this structure thread-safe, since parsers can be used in background threads. There is no way around that.
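
The memory side of this trade-off can be sketched with plain maps. This is an illustration only, not ANTLR code: the method name, the parser count and the state count are made-up assumptions. It counts how many state entries end up stored when several parsers discover the same set of states, first with one shared cache and then with one cache per parser.

```java
import java.util.HashMap;
import java.util.Map;

public class SharedVsPrivateStates {
    /**
     * Simulates `parsers` parser instances that each encounter the same
     * `distinctStates` states, and returns the total number of stored entries.
     * Single-threaded on purpose: this shows memory use only, not locking.
     */
    static int totalStoredStates(int parsers, int distinctStates, boolean shared) {
        Map<Integer, Integer> sharedCache = new HashMap<>();
        int total = 0;
        for (int p = 0; p < parsers; p++) {
            // Either every parser reuses one cache, or each gets its own.
            Map<Integer, Integer> cache = shared ? sharedCache : new HashMap<>();
            for (int s = 0; s < distinctStates; s++) {
                cache.putIfAbsent(s, cache.size());
            }
            if (!shared) total += cache.size();
        }
        return shared ? sharedCache.size() : total;
    }

    public static void main(String[] args) {
        System.out.println("shared:  " + totalStoredStates(16, 1000, true));  // prints 1000
        System.out.println("private: " + totalStoredStates(16, 1000, false)); // prints 16000
    }
}
```

With a shared cache, states learned by one instance are stored once and reused by all; with private caches, each instance stores and relearns its own copy, which is the cost paid for avoiding the lock.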

Upvotes: 0
