Reputation: 2528
I'm working through "The Definitive ANTLR 4 Reference" by T. Parr and have reached the "Building a Calculator Using a Visitor" section in chapter 4. I've gotten the book's code to work (building a Java-based system) and can reproduce the output given in the book.
The grammar file being used is:
grammar calc;
prog: stat* ;
stat: expr NEWLINE # printExpr
| ID '=' expr NEWLINE # assign
| NEWLINE # blank
;
expr: expr op=('*'|'/') expr # MulDiv
| expr op=('+'|'-') expr # AddSub
| INT # int
| ID # id
| '(' expr ')' # parens
;
MUL : '*' ; // assigns token name to '*' used above in grammar
DIV : '/' ;
ADD : '+' ;
SUB : '-' ;
ID : [a-zA-Z]+ ; // match identifiers
INT : [0-9]+ ; // match integers
NEWLINE : '\r'? '\n' ; // return newlines to parser (is end-statement signal)
WS : [ \t]+ -> skip ; // toss out whitespace
However, my application is written in C++, so I installed the C++ runtime that corresponds to the version of ANTLR I am using (I have antlr-4.13.1 and antlr-cpp-runtime-4.13.1 installed). I built the C++ runtime from source, and all tests in the test suite pass.
The C++ files were produced by running the following command, which produced no errors or warnings:
antlr4 -Dlanguage=Cpp -no-listener -visitor -encoding UTF8 calc.g4
Following several sources on the internet, I've created the following C++ main:
int main(int argc, char** argv)
{
std::string inputFile = "";
std::ifstream ins;
if(argc > 2) inputFile = argv[1];
ins.open(inputFile);
ANTLRInputStream input(ins);
calcLexer lexer(&input);
CommonTokenStream tokens(&lexer);
//tokens.fill();
//for(auto token: tokens.getTokens())
//{
// std::cout << token->toString() << std::endl;
//}
calcParser parser(&tokens);
std::cout << parser.expr()->toStringTree() << std::endl;
auto tree = parser.prog();
evalVisitor visitor = evalVisitor();
visitor.visitProg(tree);
return 0;
}
I've used the same grammar file and input file as in the book, so I would expect the same output as the Java version. However, after compiling the above file with the appropriate visitor and running the resulting application on the sample file, I get:
$ ./calc ../sample.expr
line 1:0 mismatched input '<EOF>' expecting {'(', ID, INT}
Searching on the internet, it seems that this could be related to a rare, but known bug (see the accepted answer to this question: ANTLR mismatched input '<EOF>'). Given that the grammar file works correctly in the Java version, I was suspicious of the solution, but I tried it with no luck.
In an attempt to further debug this, I added the lines that are commented out in the above code to get a listing of the tokens discovered and got this error message:
$ ./calc ../sample.expr
[@0,0:-1='<EOF>',<-1>,1:0]
line 1:0 mismatched input '<EOF>' expecting {'(', ID, INT}
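A token list that contains only <EOF> at offset 0 means the lexer received no characters at all, so one thing worth ruling out here is that the file was never opened. With a single command-line argument, argc is 2, so a guard such as argc > 2 would leave the file name empty and the open would fail silently. A minimal sketch of that check, using only standard C++ and no ANTLR calls; inputPathFromArgs and readAll are hypothetical helper names introduced for illustration:

```cpp
#include <fstream>
#include <sstream>
#include <string>

// Hypothetical helper: return the first command-line argument, or nullptr
// when none was given. With one argument argc == 2, so the test is argc >= 2.
const char* inputPathFromArgs(int argc, char** argv) {
    return (argc >= 2) ? argv[1] : nullptr;
}

// Hypothetical helper: read a whole file into `out`.
// Returns false when the file cannot be opened (for example, an empty path).
bool readAll(const std::string& path, std::string& out) {
    std::ifstream ins(path);
    if (!ins.is_open()) return false;
    std::ostringstream buf;
    buf << ins.rdbuf();   // copy the entire file into the buffer
    out = buf.str();
    return true;
}
```

If either helper reports a failure, or the returned text is empty, the lexer has nothing to tokenize and the problem lies before the grammar is ever involved.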
To make sure that the input file does not have an unusual encoding, I dumped its contents using xxd and got:
$ xxd ../sample.expr
00000000: 3139 380a 6120 3d20 350a 6220 3d20 360a 198.a = 5.b = 6.
00000010: 612b 622a 320a 2831 2b32 292a 330a a+b*2.(1+2)*3.
which is the expected output, with no indication of any strange encodings.
I'm at a loss as to what to try next, and any suggestions would be helpful.
N.B. For the record, here are the interface and implementation for the visitor. I do not think that they are an issue, as the problem manifests itself prior to the instantiation of the visitor.
evalVisitor.h:
#ifndef _calcVisitorImpl_h_
#define _calcVisitorImpl_h_
#include "calcVisitor.h"
#include "calcBaseVisitor.h"
#include <map>    // std::map, used for the variable memory
#include <string> // std::string
class evalVisitor : public calcBaseVisitor
{
public:
evalVisitor() : calcBaseVisitor() { }
~evalVisitor() { }
//antlrcpp::Any visitProg(calcParser::ProgContext *context);
antlrcpp::Any visitPrintExpr(calcParser::PrintExprContext *context);
antlrcpp::Any visitAssign(calcParser::AssignContext *context);
//antlrcpp::Any visitBlank(calcParser::BlankContext *context);
antlrcpp::Any visitParens(calcParser::ParensContext *context);
antlrcpp::Any visitMulDiv(calcParser::MulDivContext *context);
antlrcpp::Any visitAddSub(calcParser::AddSubContext *context);
antlrcpp::Any visitId(calcParser::IdContext *context);
antlrcpp::Any visitInt(calcParser::IntContext *context);
private:
std::map<std::string, int> memory;
};
#endif
evalVisitor.cpp:
#include "evalVisitor.h"
#include <iostream>
antlrcpp::Any evalVisitor::visitPrintExpr(calcParser::PrintExprContext *context)
{
std::cerr << "[?] in evalVisitor::visitPrintExpr" << std::endl;
int value = std::any_cast<int>(visit(context->expr()));
std::cout << value << std::endl;
return 0;
}
antlrcpp::Any evalVisitor::visitAssign(calcParser::AssignContext *context)
{
std::cerr << "[?] in evalVisitor::visitAssign" << std::endl;
std::string id = context->ID()->getText();
int value = std::any_cast<int>(visit(context->expr()));
memory[id] = value; // overwrite on reassignment; insert() would keep the old value
return 0;
}
antlrcpp::Any evalVisitor::visitParens(calcParser::ParensContext *context)
{
std::cerr << "[?] in evalVisitor::visitParens" << std::endl;
return visit(context->expr());
}
antlrcpp::Any evalVisitor::visitMulDiv(calcParser::MulDivContext *context)
{
std::cerr << "[?] in evalVisitor::visitMulDiv" << std::endl;
int left = std::any_cast<int>(visit(context->expr(0)));
int right = std::any_cast<int>(visit(context->expr(1)));
if(context->op->getType() == calcParser::MUL) return left * right;
else return left / right;
}
antlrcpp::Any evalVisitor::visitAddSub(calcParser::AddSubContext *context)
{
std::cerr << "[?] in evalVisitor::visitAddSub" << std::endl;
int left = std::any_cast<int>(visit(context->expr(0)));
int right = std::any_cast<int>(visit(context->expr(1)));
if(context->op->getType() == calcParser::ADD) return left + right;
else return left - right;
}
antlrcpp::Any evalVisitor::visitId(calcParser::IdContext *context)
{
std::map<std::string, int>::iterator iter;
std::cerr << "[?] in evalVisitor::visitId" << std::endl;
std::string id = context->ID()->getText();
if(memory.end() != (iter = memory.find(id))) return (*iter).second;
return 0;
}
antlrcpp::Any evalVisitor::visitInt(calcParser::IntContext *context)
{
std::cerr << "[?] in evalVisitor::visitInt" << std::endl;
return atoi(context->INT()->getText().c_str());
}
Upvotes: 0
Views: 86
Reputation: 170278
This looks to be the same issue as: Why does my grammar reports EOF error when I don't have this token?
std::cout << parser.expr()->toStringTree() << std::endl;
auto tree = parser.prog();
You did not show the input you're trying to parse, but you're instructing the parser to parse an expr first, and then telling it to parse prog.
If the input is 1 + 2, the expr will consume the entire token stream except the built-in EOF token. After that, you're trying to let the parser match prog with only the EOF remaining (hence the error "mismatched input '<EOF>'").
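The underlying point is that both rule calls share one cursor into the token stream, so whatever is invoked second starts where the first one stopped. A toy sketch of that behavior in plain C++; this is not ANTLR's API, and ToyStream, isInt and parseExpr are hypothetical names that model only a simplified rule expr: INT ('+' INT)*:

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a token stream with a single shared cursor (not ANTLR).
struct ToyStream {
    std::vector<std::string> tokens;  // the last element is always "<EOF>"
    std::size_t pos;
    explicit ToyStream(std::vector<std::string> t)
        : tokens(std::move(t)), pos(0) {}
    const std::string& peek() const { return tokens[pos]; }
    void consume() { if (pos + 1 < tokens.size()) ++pos; }  // never pass EOF
};

// True when the token consists only of decimal digits.
bool isInt(const std::string& t) {
    if (t.empty()) return false;
    for (unsigned char c : t) {
        if (!std::isdigit(c)) return false;
    }
    return true;
}

// Simplified rule: expr : INT ('+' INT)* ;
// Returns false on a mismatch, leaving the cursor at the offending token.
bool parseExpr(ToyStream& s) {
    if (!isInt(s.peek())) return false;
    s.consume();
    while (s.peek() == "+") {
        s.consume();
        if (!isInt(s.peek())) return false;
        s.consume();
    }
    return true;
}
```

With the tokens 1, +, 2, <EOF>, the first parseExpr call succeeds and leaves the cursor on <EOF>; a second call on the same stream then fails immediately because no tokens are left. Calling only the start rule once, and printing the tree returned by that single call, avoids the problem.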
Upvotes: 0