ANTLR4 Specific search

Question

Basically, I want to use ANTLR to find every expression in a file that has the form:

WORD.WORD

For example, "end.beginning" matches.

The file can have hundreds of lines and a complex structure.

Is there a way to skip everything (every character?) that does not match the pattern described above, without writing a grammar that fully represents the file?

This is my grammar so far, but I don't know what to do next.

grammar Dep;

program
    : 
       dependencies
    ;

dependencies
    :
      (
       dependency
      )*
    ;

dependency
    :
      identifier
      DOT
      identifier
    ;

identifier
    :
      IDENTIFIER
    ;

DOT     : '.'       ;

IDENTIFIER
  :
    [a-zA-Z_] [a-zA-Z0-9_]*
  ;

OTHER
  :
    . -> skip
  ;

Bart Kiers · Accepted Answer

The way you're doing it now, the dependency rule would also match the tokens 'end', '.', 'beginning' from the input:

end
#####
.
#####
beginning

because the line breaks and '#'s are being skipped from the token stream.
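To make the effect of skipping concrete, here is a minimal plain-Java sketch that mimics the question's lexer rules with a regular expression. It is not ANTLR itself, and the class name SkipDemo and the method tokenize are made up for illustration. It only shows which tokens survive once everything else is skipped:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SkipDemo {

  // Mimics the question's lexer: identifiers and dots become tokens,
  // every other character is dropped, as "OTHER : . -> skip" would do.
  static List<String> tokenize(String input) {
    Pattern token = Pattern.compile("[a-zA-Z_][a-zA-Z0-9_]*|\\.");
    List<String> tokens = new ArrayList<>();
    Matcher m = token.matcher(input);
    while (m.find()) {
      tokens.add(m.group());
    }
    return tokens;
  }

  public static void main(String[] args) {
    String input = "end\n#####\n.\n#####\nbeginning";
    // The parser never sees the line breaks or the '#' characters.
    System.out.println(tokenize(input)); // prints [end, ., beginning]
  }
}
```

Since the parser only receives these three tokens, a rule of the form identifier DOT identifier accepts them even though they sit on different lines in the input.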

If that is not what you want, i.e. you'd like to match "end.beginning" without any char in between, you should make a single lexer rule of it, and match that rule in your parser:

grammar Dep;

program
 : DEPENDENCY* EOF
 ;

DEPENDENCY
 : [a-zA-Z_] [a-zA-Z0-9_]* '.' [a-zA-Z_] [a-zA-Z0-9_]*
 ;

OTHER
 : . -> skip
 ;

Then you could use a tree listener to do something useful with your DEPENDENCY tokens:

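import org.antlr.v4.runtime.*;
import org.antlr.v4.runtime.tree.*;

public class Main {

  public static void main(String[] args) throws Exception {

    String input = "### end.beginning ### end ### foo.bar mu foo.x";
    // ANTLRInputStream is deprecated in newer ANTLR 4 runtimes;
    // CharStreams.fromString(input) is the replacement there.
    DepLexer lexer = new DepLexer(new ANTLRInputStream(input));
    DepParser parser = new DepParser(new CommonTokenStream(lexer));

    // Walk the parse tree and print every DEPENDENCY token under `program`.
    // (The @NotNull annotation is omitted: newer runtimes no longer ship it.)
    ParseTreeWalker.DEFAULT.walk(new DepBaseListener(){
      @Override
      public void enterProgram(DepParser.ProgramContext ctx) {
        for (TerminalNode node : ctx.DEPENDENCY()) {
          System.out.println("node=" + node.getText());
        }
      }
    }, parser.program());
  }
}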

which would print:

node=end.beginning
node=foo.bar
node=foo.x
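As a quick cross-check that does not need the generated lexer and parser, the same character classes as the DEPENDENCY rule can be written as a java.util.regex pattern. This is only a sketch for sanity-checking the expected matches, not a replacement for the grammar; the class name DependencyRegexCheck and the method findDependencies are hypothetical:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DependencyRegexCheck {

  // Same character classes as the DEPENDENCY lexer rule:
  // identifier, a literal dot, identifier, with nothing in between.
  private static final Pattern DEPENDENCY = Pattern.compile(
      "[a-zA-Z_][a-zA-Z0-9_]*\\.[a-zA-Z_][a-zA-Z0-9_]*");

  // Collects every non-overlapping match, scanning left to right.
  static List<String> findDependencies(String input) {
    List<String> found = new ArrayList<>();
    Matcher m = DEPENDENCY.matcher(input);
    while (m.find()) {
      found.add(m.group());
    }
    return found;
  }

  public static void main(String[] args) {
    String input = "### end.beginning ### end ### foo.bar mu foo.x";
    for (String dep : findDependencies(input)) {
      System.out.println("node=" + dep);
    }
  }
}
```

For the sample input this prints the same three lines as the listener. An input such as "end\n.\nbeginning" yields no match here, which is the behaviour the single lexer rule is meant to give.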
