edd

Reputation: 21

Recognising EOF File Character in Antlr3 Lexer

I'm trying to parse some strings using ANTLR 3. They are enclosed in single quotation marks, so if the user doesn't supply an even number of quotation marks, the lexer runs all the way to the end of the file because it assumes everything is one massive string.

Is there a way to tell ANTLR to recognize the EOF character? I've tried '<EOF>' and '\\z' to no avail.

Upvotes: 2

Views: 4673

Answers (2)

Sylvain Lecorné

Reputation: 677

For some reason EOF didn't work for me (I'm using ANTLR v4). An alternative is to handle the end of input at a higher level. For example, if you define your statement separator this way:

program     : statement+ ;
statement   : some_stuff NEWLINE;

You could replace it with the following, which makes the NEWLINE after the last statement optional:

program     : (statement NEWLINE)* statement? ;
statement   : some_stuff;

Upvotes: 0

Bart Kiers

Reputation: 170188

To handle a single quoted string literal in ANTLR, you'd do something like this:

SingleQuotedString
  :  '\'' ('\\' ('\\' | '\'') | ~('\\' | '\'' | '\r' | '\n'))* '\''
  ;

meaning:

'\''                              # a single quote
(                                 # (
  '\\' ('\\' | '\'')              #   a backslash followed by \ or '
  |                               #   OR
  ~('\\' | '\'' | '\r' | '\n')    #   any char other than \, ', \r and \n
)*                                # ) zero or more times
'\''                              # a single quote
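To see why this rule cannot swallow the rest of the file, here is a rough Java sketch that mirrors it with a regular expression. This is only an approximation for illustration, not what ANTLR generates; the class name SingleQuotedStringSketch and the method isSingleQuotedString are made up for this example. Because the character class excludes \r and \n, an unterminated string fails to match at the end of its line instead of running on to the end of the input.

```java
import java.util.regex.Pattern;

public class SingleQuotedStringSketch {
    // Hypothetical regex counterpart of the SingleQuotedString lexer rule:
    //   '                  a single quote
    //   (?: \\[\\']        a backslash followed by \ or '
    //     | [^\\'\r\n] )*  or any char other than \, ', \r and \n
    //   '                  a single quote
    private static final Pattern SINGLE_QUOTED =
        Pattern.compile("'(?:\\\\[\\\\']|[^\\\\'\\r\\n])*'");

    // Returns true only if the whole text is one well-formed single-quoted string.
    static boolean isSingleQuotedString(String text) {
        return SINGLE_QUOTED.matcher(text).matches();
    }

    public static void main(String[] args) {
        System.out.println(isSingleQuotedString("'abc'"));       // closed string
        System.out.println(isSingleQuotedString("'it\\'s'"));    // escaped quote inside
        System.out.println(isSingleQuotedString("'abc"));        // missing closing quote
        System.out.println(isSingleQuotedString("'abc\ndef'"));  // line break inside
    }
}
```

The first two calls print true and the last two print false: a missing closing quote or an embedded line break makes the match fail rather than extend.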

And to denote the end-of-file token inside ANTLR rules, simply use EOF:

parse
  :  SingleQuotedString+ EOF
  ;

which will match one or more SingleQuotedStrings, followed by the end of the file (EOF). The char '\z' is not a valid escape char inside ANTLR rules.

Upvotes: 1
