Reputation: 23
I am using an ANTLRv4-generated parser to process a large number of files on a machine with multiple cores. To gain some extra speed, I would like to process the files in parallel.
To check whether parser performance is CPU-bound, I split the files into groups and parsed them using independent processes, each running the same parser in a dedicated JVM instance. This increased performance drastically.
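For illustration, the grouping step could look roughly like the sketch below. The class name `FileGroups` and the round-robin strategy are assumptions made for this example, not something taken from my actual setup; each resulting group would then be handed to its own JVM process.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: splits a list of file names into groups, one per process.
final class FileGroups {
    /** Distributes the items round-robin into groupCount lists. */
    static <T> List<List<T>> split(List<T> items, int groupCount) {
        if (groupCount < 1) {
            throw new IllegalArgumentException("groupCount must be >= 1");
        }
        List<List<T>> groups = new ArrayList<>();
        for (int i = 0; i < groupCount; i++) {
            groups.add(new ArrayList<>());
        }
        for (int i = 0; i < items.size(); i++) {
            groups.get(i % groupCount).add(items.get(i));
        }
        return groups;
    }

    public static void main(String[] args) {
        // File names are made up for the example.
        List<String> files = Arrays.asList("a.src", "b.src", "c.src", "d.src", "e.src");
        System.out.println(split(files, 2)); // prints [[a.src, c.src, e.src], [b.src, d.src]]
    }
}
```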
This encouraged me to try the same with multiple threads instead of processes, but without success. I created two worker threads, each with its own parser, lexer, and file stream instance. The results returned are correct; however, using two threads actually takes slightly longer than using one.
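The threaded setup is structured roughly like the following sketch. It is a simplified stand-in, not my real code: `ParallelParse` and the `parseOneFile` function are hypothetical names, and the function is a placeholder for the place where each task would build its own file stream, lexer, and parser from the ANTLR-generated classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical driver: runs one parse task per file on a fixed-size thread pool.
final class ParallelParse {
    static <R> List<R> parseAll(List<String> files, int threads,
                                Function<String, R> parseOneFile) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<R>> futures = new ArrayList<>();
            for (String file : files) {
                // Each task must create its own stream, lexer, and parser inside
                // parseOneFile; no parser object is shared between tasks here.
                Callable<R> task = () -> parseOneFile.apply(file);
                futures.add(pool.submit(task));
            }
            List<R> results = new ArrayList<>();
            for (Future<R> future : futures) {
                results.add(future.get()); // results come back in submission order
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // String::length stands in for the real per-file parsing function.
        List<String> files = Arrays.asList("a.txt", "bb.txt");
        System.out.println(parseAll(files, 2, String::length)); // prints [5, 6]
    }
}
```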
To ensure that I am using threads correctly and that there is no problem with the JVM installation, I temporarily replaced the parsing code with code that calculates Fibonacci sequences. In that case, using multiple threads led to a performance increase.
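A sanity check along these lines can be sketched as follows. The class name `FibCheck`, the task count, and the input size are arbitrary choices for this example; the naive recursion is deliberately slow so that each task is purely CPU-bound and touches no shared state.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical CPU-bound benchmark used in place of the parsing code.
final class FibCheck {
    /** Naive recursive Fibonacci; intentionally inefficient to keep a core busy. */
    static long fib(int n) {
        return n < 2 ? n : fib(n - 1) + fib(n - 2);
    }

    /** Runs `tasks` independent fib(n) computations on `threads` threads and returns elapsed milliseconds. */
    static long timeMillis(int threads, int tasks, int n) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        long start = System.nanoTime();
        try {
            List<Future<Long>> futures = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                Callable<Long> task = () -> fib(n);
                futures.add(pool.submit(task));
            }
            for (Future<Long> future : futures) {
                future.get(); // wait for every task to finish
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println("1 thread:  " + timeMillis(1, 8, 32) + " ms");
        System.out.println(cores + " threads: " + timeMillis(cores, 8, 32) + " ms");
    }
}
```

If the multi-threaded run is clearly faster here but not with the parser, the thread setup itself is unlikely to be the problem.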
Analyzing this behavior, I found that when using multiple parsing threads, none of the CPUs reaches high usage. It looks like the threads are fighting over some shared resource.
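One way to look for this kind of contention from inside the JVM is the standard `java.lang.management` API, which reports how often a thread has blocked while trying to enter a monitor. The sketch below is only an illustration: `ContentionProbe` is a made-up name, and the demo forces a single artificial lock conflict rather than measuring real parser threads. In practice, `blockedCount` would be sampled for each worker thread while parsing is running.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

// Hypothetical diagnostic helper built on the standard ThreadMXBean API.
final class ContentionProbe {
    /** Times the given live thread has blocked entering a monitor, or -1 if it is no longer alive. */
    static long blockedCount(Thread thread) {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        ThreadInfo info = bean.getThreadInfo(thread.getId());
        return info == null ? -1 : info.getBlockedCount();
    }

    /** Forces one worker thread to block on a lock held by the caller and returns its blocked count. */
    static long demo() {
        final Object lock = new Object();
        final CountDownLatch release = new CountDownLatch(1);
        Thread worker = new Thread(() -> {
            synchronized (lock) {
                // Acquired only after the calling thread leaves its synchronized block.
            }
            try {
                release.await(); // stay alive so the JVM can still report on this thread
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.setDaemon(true);
        try {
            synchronized (lock) {
                worker.start();
                // Wait until the worker is actually blocked on our lock.
                while (worker.getState() != Thread.State.BLOCKED) {
                    Thread.sleep(1);
                }
            }
            long count = blockedCount(worker);
            release.countDown();
            worker.join();
            return count;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("blocked count of worker: " + demo());
    }
}
```

A blocked count that keeps growing on the parsing threads while CPU usage stays low would support the shared-resource theory; a thread dump or profiler can then show which monitor is involved.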
Taking a look at the ANTLR source code, I found the following comment in ParserATNSimulator.java:
"All instances of the same parser share the same decision DFAs through a static field. Each instance gets its own ATN simulator but they share the same decisionToDFA field. They also share a PredictionContextCache object that makes sure that all PredictionContext objects are shared among the DFA states. This makes a big size difference."
I am wondering whether synchronized access to these shared resources is causing the performance problems. If so, is it possible to create separate instances of these resources for each parser instead? Or is there perhaps an even simpler solution to the problem?
Thanks in advance!
Fabian
Upvotes: 2
Views: 1575
Reputation: 99869
The reference version of the ANTLR 4 runtime is designed to be safe when using multiple parser threads (so long as multiple parser instances are used). I maintain an alternate (unofficial) branch of ANTLR 4 which implements the core algorithms in a different way to improve performance in multithreaded scenarios.
This branch exposes a slightly different API in some areas, so it's not a drop-in replacement for the 4.0 release of ANTLR 4.
https://github.com/sharwell/antlr4/tree/optimized
Upvotes: 5