Reputation: 175
I am trying to use a class derived from the generated BaseVisitor class to construct an AST from the parse tree of the grammar I am using for a simple compiler.
Consider a subset of my grammar, where stat is a statement of a lightweight language:
...
program: stat;
stat: SkipStat # Skip
| type Ident AssignEq assignRHS # Declare
| assignLHS AssignEq assignRHS # Assign
...
My understanding (as per this post) is that the visitor should call visit(ctx->stat()), where ctx has type ProgramContext*. The derived visitor then correctly dispatches to the corresponding overridden visitSkip(..), visitDeclare(..), etc.
I have simple node classes for my AST; again, a subset looks as follows:
struct BaseNode {};
// Stat is defined before Program, which refers to it.
struct Stat : BaseNode {};
struct Program : BaseNode {
    Program(std::shared_ptr<Stat> body) : body(std::move(body)) {}
    std::shared_ptr<Stat> body;
};
struct Assign : Stat {
    Assign(std::shared_ptr<AssignLHS> lhs, std::shared_ptr<AssignRHS> rhs) :
        lhs(std::move(lhs)),
        rhs(std::move(rhs)) {}
    std::shared_ptr<AssignLHS> lhs;
    std::shared_ptr<AssignRHS> rhs;
};
struct Declare : Stat {
    Declare(std::shared_ptr<Type> type, std::string name, std::shared_ptr<AssignRHS> rhs) :
        type(std::move(type)),
        name(std::move(name)),
        rhs(std::move(rhs)) {}
    std::shared_ptr<Type> type;
    std::string name;
    std::shared_ptr<AssignRHS> rhs;
};
struct Skip : Stat {};
Tying the two points together, I am trying to have the mentioned visitSkip(..), visitDeclare(..), etc. (which all return std::any) return std::shared_ptr<Skip>, std::shared_ptr<Declare>, etc., so that visitProgram(..) can receive them from a call to visit in the form
std::shared_ptr<Stat> stat = std::any_cast<std::shared_ptr<Stat>>(visit(ctx->stat()));
However (!), std::any_cast only succeeds when the requested type exactly matches the stored type, not when it is a base class of the stored type, so this approach does not work: the cast throws std::bad_any_cast. Instead, I have started to create my own visitor entirely separate from (i.e. not a child of) the generated visitor.
I assume there is a better solution using the generated classes that I am missing.
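For illustration, here is a minimal sketch of the exact-type behaviour described above, along with one possible workaround: upcasting to std::shared_ptr<Stat> before the value is stored in the std::any. It does not depend on ANTLR; the free functions visitSkipDerived, visitSkipAsStat and holdsStatPtr are hypothetical stand-ins for the overridden visit methods, not part of any generated code.

```cpp
#include <any>
#include <cassert>
#include <memory>

struct BaseNode { virtual ~BaseNode() = default; };
struct Stat : BaseNode {};
struct Skip : Stat {};

// Hypothetical stand-in for visitSkip(..): the any stores shared_ptr<Skip>.
std::any visitSkipDerived() {
    return std::make_shared<Skip>();
}

// Same node, but upcast first, so the any stores shared_ptr<Stat>.
std::any visitSkipAsStat() {
    return std::shared_ptr<Stat>(std::make_shared<Skip>());
}

// Non-throwing check: the pointer form of any_cast yields nullptr on mismatch.
bool holdsStatPtr(const std::any& result) {
    return std::any_cast<std::shared_ptr<Stat>>(&result) != nullptr;
}
```

With this, holdsStatPtr(visitSkipDerived()) is false while holdsStatPtr(visitSkipAsStat()) is true, so a visitProgram(..) that casts to std::shared_ptr<Stat> would only work if every statement visitor upcasts before returning.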
Having found this answer to a not dissimilar post, is it even worth constructing an AST? If my idea of how to use ANTLR 4 is inaccurate, please let me know and point me towards a good source I can start from. Thanks.
Edit: As per chapter 7 of The Definitive ANTLR 4 Reference, I believe I could achieve what I want by using a stack holding BaseNode* and casting the popped nodes appropriately. This does not seem like the best solution. Ideally, I would like to achieve something similar to the Java implementation, where we pass an expected return type into the visitor classes.
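One way to approximate a typed return value on top of std::any might look like the sketch below. It assumes every visit method stores the same pointer type, std::shared_ptr<BaseNode>, and that BaseNode is polymorphic (it has a virtual destructor, unlike the struct shown earlier). The names NodePtr, makeNode and nodeAs are hypothetical helpers, not part of the ANTLR runtime.

```cpp
#include <any>
#include <cassert>
#include <memory>
#include <utility>

struct BaseNode { virtual ~BaseNode() = default; };
struct Stat : BaseNode {};
struct Skip : Stat {};
struct Type : BaseNode {};

using NodePtr = std::shared_ptr<BaseNode>;

// Hypothetical helper: build a node and store it as the common base pointer.
template <typename T, typename... Args>
std::any makeNode(Args&&... args) {
    return NodePtr(std::make_shared<T>(std::forward<Args>(args)...));
}

// Hypothetical helper: recover a typed pointer from a visit result.
// Throws std::bad_any_cast if the any does not hold a NodePtr;
// returns nullptr if the node is not a T.
template <typename T>
std::shared_ptr<T> nodeAs(const std::any& result) {
    return std::dynamic_pointer_cast<T>(std::any_cast<NodePtr>(result));
}
```

A visitSkip(..) override could then end with return makeNode<Skip>(); and visitProgram(..) could call nodeAs<Stat>(visit(ctx->stat())), checking the result for nullptr.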
Edit 2: I have now implemented such a solution (using a listener instead of visitors); an example from the exitAssign(..) function is below:
void Listener::exitAssign(Parser::AssignContext* ctx) {
    const auto rhs = std::static_pointer_cast<AssignRHS>(m_stack.top());
    m_stack.pop();
    const auto lhs = std::static_pointer_cast<AssignLHS>(m_stack.top());
    m_stack.pop();
    m_stack.push(std::make_shared<Assign>(lhs, rhs));
}
I still feel this solution is not the best - it feels very hacky, as the arguments must be popped in reverse order, and it is easy to forget to push onto the stack after creating the AST node.
I will use this implementation for now, but again, if people who use ANTLR 4 in C++ prefer a better method, please do let me know.
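As a rough sketch of how the stack approach could be made less error-prone, the pop-and-cast and the push could be wrapped in a small class. The operands still have to be popped in reverse order, but a wrong order or an empty stack is reported at runtime instead of going unnoticed through static_pointer_cast. The NodeStack class, its popAs and emplace members, and the reduceAssign function are hypothetical names, and BaseNode is assumed to have a virtual destructor so that dynamic_pointer_cast works.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <stack>
#include <stdexcept>
#include <utility>

struct BaseNode { virtual ~BaseNode() = default; };
struct AssignLHS : BaseNode {};
struct AssignRHS : BaseNode {};
struct Stat : BaseNode {};
struct Assign : Stat {
    Assign(std::shared_ptr<AssignLHS> lhs, std::shared_ptr<AssignRHS> rhs)
        : lhs(std::move(lhs)), rhs(std::move(rhs)) {}
    std::shared_ptr<AssignLHS> lhs;
    std::shared_ptr<AssignRHS> rhs;
};

// Hypothetical wrapper around the listener's node stack.
class NodeStack {
public:
    // Construct a node of type T and push it in one step.
    template <typename T, typename... Args>
    void emplace(Args&&... args) {
        m_stack.push(std::make_shared<T>(std::forward<Args>(args)...));
    }

    // Pop the top node, checking that it really is a T.
    template <typename T>
    std::shared_ptr<T> popAs() {
        if (m_stack.empty()) throw std::logic_error("AST stack underflow");
        auto node = std::dynamic_pointer_cast<T>(m_stack.top());
        if (!node) throw std::logic_error("unexpected AST node type on stack");
        m_stack.pop();
        return node;
    }

    std::size_t size() const { return m_stack.size(); }

private:
    std::stack<std::shared_ptr<BaseNode>> m_stack;
};

// What an exitAssign(..) body could reduce to under these assumptions.
void reduceAssign(NodeStack& stack) {
    auto rhs = stack.popAs<AssignRHS>();  // pushed last, so popped first
    auto lhs = stack.popAs<AssignLHS>();
    stack.emplace<Assign>(std::move(lhs), std::move(rhs));
}
```

The dynamic cast costs a little at runtime compared with static_pointer_cast; whether that trade-off is worthwhile depends on how much checking is wanted while the listener is being developed.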
Upvotes: 0
Views: 353
Reputation: 86
To parse an arithmetic expression, I preferred to use a listener that overrides the "exit" methods:
#include <memory>
#include <vector>
#include "antlr4-runtime.h"
// Headers generated by ANTLR for the Formula grammar
#include "FormulaBaseListener.h"
#include "FormulaLexer.h"
#include "FormulaParser.h"

class MyListener final : public FormulaBaseListener {
public:
    void exitUnaryOp(FormulaParser::UnaryOpContext *ctx) override;
    void exitLiteral(FormulaParser::LiteralContext *ctx) override;
    void exitCell(FormulaParser::CellContext *ctx) override;
    void exitBinaryOp(FormulaParser::BinaryOpContext *ctx) override;
    std::vector<astToken> getResult();
};

int main() {
    antlr4::ANTLRInputStream input(".........");
    std::unique_ptr<FormulaLexer> up_fl;
    up_fl = std::make_unique<FormulaLexer>(&input);
    FormulaLexer& fl = *up_fl;
    BailErrorListener error_listener; // custom: public antlr4::BaseErrorListener with syntaxError override
    fl.removeErrorListeners();
    fl.addErrorListener(&error_listener);
    antlr4::CommonTokenStream tokens(&fl);
    std::unique_ptr<FormulaParser> up_parser;
    up_parser = std::make_unique<FormulaParser>(&tokens);
    FormulaParser& parser = *up_parser;
    // Stop at the first syntax error instead of attempting recovery
    auto error_handler = std::make_shared<antlr4::BailErrorStrategy>();
    parser.setErrorHandler(error_handler);
    parser.removeErrorListeners();
    FormulaParser::MainContext* tree;
    tree = parser.main();
    MyListener listener; // custom final: public FormulaBaseListener with void exit_*_(FormulaParser::_*_Context *ctx) override;
    // Walk the parse tree; the exit* overrides build the result
    antlr4::tree::ParseTreeWalker::DEFAULT.walk(&listener, tree);
    std::vector<astToken> asttree = listener.getResult(); // get what you want
}
Upvotes: 0