Reputation: 147
Basically, I want to use ANTLR to find every expression in a file that has the form:
WORD.WORD
For example, "end.beginning" matches.
At the moment the file can have hundreds of lines and a complex structure.
Is there a way to skip everything (every character?) that does not match the pattern described above, without writing a grammar that fully represents the file?
So far this is my grammar, but I don't know what to do next.
grammar Dep;

program
    : dependencies
    ;

dependencies
    : ( dependency )*
    ;

dependency
    : identifier DOT identifier
    ;

identifier
    : INDENTIFIER
    ;

DOT : '.' ;

INDENTIFIER
    : [a-zA-Z_] [a-zA-Z0-9_]*
    ;

OTHER
    : . -> skip
    ;
Upvotes: 1
Views: 277
Reputation: 170308
The way you're doing it now, the dependency rule would also match the tokens 'end', '.', 'beginning' from the input:
end
#####
.
#####
beginning
because the line breaks and '#'s are being skipped from the token stream.
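To see the effect without running ANTLR, here is a rough plain-Java simulation of that token stream. It is only a sketch: the class name SkipDemo and the method tokens are made up for illustration, and a regex stands in for the question's INDENTIFIER, DOT and OTHER rules (anything the regex does not match is dropped, like "-> skip").

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SkipDemo {
    // An identifier or a single dot, mirroring the question's token rules.
    // Every other character is never matched, which mimics "-> skip".
    private static final Pattern TOKEN =
            Pattern.compile("[a-zA-Z_][a-zA-Z0-9_]*|\\.");

    // Returns the tokens the parser would get to see, in order.
    static List<String> tokens(String input) {
        List<String> result = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            result.add(m.group());
        }
        return result;
    }

    public static void main(String[] args) {
        // Both inputs produce the same token list, so the parser cannot tell them apart.
        System.out.println(tokens("end.beginning"));
        System.out.println(tokens("end\n#####\n.\n#####\nbeginning"));
    }
}
```

Both calls print [end, ., beginning], which is why the dependency rule accepts the multi-line input as well.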
If that is not what you want, i.e. you'd like to match "end.beginning" without any characters in between, you should turn it into a single lexer rule and match that rule in your parser:
grammar Dep;

program
    : DEPENDENCY* EOF
    ;

DEPENDENCY
    : [a-zA-Z_] [a-zA-Z0-9_]* '.' [a-zA-Z_] [a-zA-Z0-9_]*
    ;

OTHER
    : . -> skip
    ;
Then you could use a tree listener to do something useful with your DEPENDENCY tokens:
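import org.antlr.v4.runtime.ANTLRInputStream;
import org.antlr.v4.runtime.CommonTokenStream;
import org.antlr.v4.runtime.tree.ParseTreeWalker;
import org.antlr.v4.runtime.tree.TerminalNode;

public class Main {
    public static void main(String[] args) throws Exception {
        String input = "### end.beginning ### end ### foo.bar mu foo.x";
        // ANTLRInputStream is deprecated since ANTLR 4.7; CharStreams.fromString(input) replaces it there.
        DepLexer lexer = new DepLexer(new ANTLRInputStream(input));
        DepParser parser = new DepParser(new CommonTokenStream(lexer));
        ParseTreeWalker.DEFAULT.walk(new DepBaseListener() {
            @Override
            public void enterProgram(DepParser.ProgramContext ctx) {
                // DEPENDENCY() returns every DEPENDENCY token matched by the program rule.
                for (TerminalNode node : ctx.DEPENDENCY()) {
                    System.out.println("node=" + node.getText());
                }
            }
        }, parser.program());
    }
}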
which would print:
node=end.beginning
node=foo.bar
node=foo.x
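If you want a quick sanity check of what the DEPENDENCY rule should pick up before generating the lexer and parser, the same character classes can be written as a java.util.regex pattern. This is only a sketch for comparison, not a replacement for the grammar: the class name DependencyRegexCheck and the method find are hypothetical, and I'd only expect it to agree with the lexer on simple inputs like the ones here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DependencyRegexCheck {
    // Same character classes as the DEPENDENCY lexer rule, written as a regex.
    private static final Pattern DEPENDENCY = Pattern.compile(
            "[a-zA-Z_][a-zA-Z0-9_]*\\.[a-zA-Z_][a-zA-Z0-9_]*");

    // Collects every non-overlapping WORD.WORD match, scanning left to right.
    static List<String> find(String input) {
        List<String> result = new ArrayList<>();
        Matcher m = DEPENDENCY.matcher(input);
        while (m.find()) {
            result.add(m.group());
        }
        return result;
    }

    public static void main(String[] args) {
        for (String d : find("### end.beginning ### end ### foo.bar mu foo.x")) {
            System.out.println("node=" + d);
        }
        // The multi-line input no longer matches, because the dot
        // is not directly adjacent to the two words.
        System.out.println(find("end\n#####\n.\n#####\nbeginning").isEmpty());
    }
}
```

For the sample input this prints the same three node= lines as the listener, followed by true for the multi-line input.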
Upvotes: 1