Reputation: 71
I am in the process of adapting the SystemVerilog LRM grammar to Antlr4. That is huge overkill for what I really need, however: dependency analysis similar to the -M switch in gcc. This problem has been surprisingly difficult to solve. My current regex-based solution is incomplete, buggy, and constantly breaks when exposed to new code, even though it has been patched many times. I have tried various freely available parsers, but none of them seem to handle code that conforms to the latest SystemVerilog (2012) standard.
I think I need a parser based approach, and I think I am stuck building my own parser. But I am very interested to hear any other suggestions about this. I can't be the only one who has this problem.
Here is my Antlr question: I am attempting to use the "island in the stream" (island grammar) approach, where the Antlr grammar ignores most of the detail and complexity of the SystemVerilog language and only parses code where modules are instantiated or header files are included. The obvious difficulty is distinguishing the code I care about from the code I don't. Has anyone used Antlr this way (not necessarily for SystemVerilog)? I am hoping for a strategy for writing the "catch all" rule that matches everything unrelated to module instances.
Thanks.
Upvotes: 1
Views: 633
Reputation: 6001
The idiomatic strategy is to match what is wanted and let everything else be consumed by an 'other' rule. So the basic structure of the parser will be:
    verilog      : statement+ EOF ;

    statement    : header
                 | module
                 | <<etc>>
                 | other
                 ;

    header       : INCLUDE filePathspec SEMI ;
    filePathspec : <<whatever>> ;
    module       : MODULE <<whatever>> SEMI ;
    other        : . ;   // consume a single, uninteresting token at a time
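The only requirement is to make the statement rules detailed enough to match their statements unambiguously. The Verilog syntax gives you that explicitly. Keep 'other' as the last alternative: when more than one alternative can match the same input, Antlr 4 resolves the ambiguity in favor of the alternative listed first.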
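To make the control flow concrete, here is a minimal sketch of the same "match what is wanted, consume everything else one token at a time" idea. It is not Antlr and not generated from the grammar above; it is a hand-written Python scan that mimics the rule structure. Everything in it is an assumption made for illustration: the token set, the abbreviated KEYWORDS list, the function names, and the simplified instance shape `ID ('#' parens)? ID parens ';'` (instance arrays and comma-separated instance lists are not handled).

```python
import re

# Hypothetical, deliberately tiny token set: not the full SystemVerilog lexicon.
TOKEN_RE = re.compile(r'''
    (?P<COMMENT>//[^\n]*|/\*.*?\*/)
  | (?P<STRING>"(?:\\.|[^"\\])*")
  | (?P<INCLUDE>`include\b)
  | (?P<ID>[A-Za-z_][A-Za-z0-9_$]*)
  | (?P<WS>\s+)
  | (?P<OTHER>.)
''', re.VERBOSE | re.DOTALL)

# Abbreviated keyword list (assumption); a real tool needs the complete set,
# otherwise "keyword name ( ... );" can be mistaken for an instantiation.
KEYWORDS = frozenset("""
    module endmodule input output inout logic wire reg assign always
    always_ff always_comb begin end if else case endcase for return
    function endfunction task endtask parameter localparam generate
""".split())


def tokenize(text):
    """Return (kind, text) pairs, dropping whitespace and comments."""
    return [(m.lastgroup, m.group()) for m in TOKEN_RE.finditer(text)
            if m.lastgroup not in ("WS", "COMMENT")]


def skip_parens(tokens, i):
    """Return the index just past the ')' matching the '(' at tokens[i], or None."""
    depth = 0
    for j in range(i, len(tokens)):
        if tokens[j][1] == "(":
            depth += 1
        elif tokens[j][1] == ")":
            depth -= 1
            if depth == 0:
                return j + 1
    return None


def is_name(tok):
    return tok[0] == "ID" and tok[1] not in KEYWORDS


def try_header(tokens, i):
    """header : INCLUDE STRING  -> (file name, next index) or None."""
    if tokens[i][0] == "INCLUDE" and i + 1 < len(tokens) and tokens[i + 1][0] == "STRING":
        return tokens[i + 1][1][1:-1], i + 2
    return None


def try_instance(tokens, i):
    """instance : ID ('#' parens)? ID parens ';'  -> (module name, next index) or None."""
    n = len(tokens)
    if not is_name(tokens[i]):
        return None
    j = i + 1
    if j + 1 < n and tokens[j][1] == "#" and tokens[j + 1][1] == "(":
        j = skip_parens(tokens, j + 1)
        if j is None:
            return None
    if j + 1 < n and is_name(tokens[j]) and tokens[j + 1][1] == "(":
        end = skip_parens(tokens, j + 1)
        if end is not None and end < n and tokens[end][1] == ";":
            return tokens[i][1], end + 1
    return None


def scan(text):
    """Collect included files and instantiated module names; skip everything else."""
    tokens = tokenize(text)
    includes, modules = [], []
    i = 0
    while i < len(tokens):
        hit = try_header(tokens, i)
        if hit:
            includes.append(hit[0])
            i = hit[1]
            continue
        hit = try_instance(tokens, i)
        if hit:
            modules.append(hit[0])
            i = hit[1]
            continue
        i += 1  # the 'other' rule: consume one uninteresting token
    return includes, modules


SAMPLE = '''
`include "defs.svh"
module top (input logic clk, output logic q);
  // fifo commented_out (.clk(clk));
  logic [7:0] d;
  fifo #(.DEPTH(16)) u_fifo (.clk(clk), .d(d));
  assign q = f(d);
  dff u_dff (.clk(clk), .d(d[0]), .q(q));
endmodule
'''

print(scan(SAMPLE))
```

The specific rules are tried first at every position and the single-token fallback runs only when none of them matches, which is the role the 'other' alternative plays in the grammar. Working on tokens rather than raw text is what keeps comments and string contents from producing false matches.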
UPDATE
Take a look at the example Verilog grammar in the grammar archive.
Upvotes: 0