josh-stackexchange

Reputation: 175

Inheritance issues with std::any using antlr4 c++ visitors to construct ASTs

I am trying to use a class derived from the generated BaseVisitor class to construct an AST from the parse tree of the grammar I am using for a simple compiler.

Consider a subset of my grammar where Stat is a statement of a lightweight language:

...
program: stat;

stat: SkipStat                         # Skip
    | type Ident AssignEq assignRHS    # Declare
    | assignLHS AssignEq assignRHS     # Assign

...

My understanding (as per this post) is that the visitor should call visit(ctx->stat()), where ctx has type ProgramContext*. The derived visitor then dispatches to the corresponding overridden visitSkip(..), visitDeclare(..), etc.

I have simple node classes for my AST, again a subset looks as follows:

struct BaseNode {};

// Declared before Program so that std::shared_ptr<Stat> refers to a known type.
struct Stat : BaseNode {};

struct Program : BaseNode {
    Program(std::shared_ptr<Stat> body) : body(std::move(body)) {}
    std::shared_ptr<Stat> body;
};

struct Assign : Stat {
    Assign(std::shared_ptr<AssignLHS> lhs, std::shared_ptr<AssignRHS> rhs) :
        lhs(std::move(lhs)),
        rhs(std::move(rhs)) {}

    std::shared_ptr<AssignLHS> lhs;
    std::shared_ptr<AssignRHS> rhs;
};

struct Declare : Stat {
    Declare(std::shared_ptr<Type> type, std::string name, std::shared_ptr<AssignRHS> rhs) :
        type(std::move(type)),
        name(std::move(name)),
        rhs(std::move(rhs)) {}

    std::shared_ptr<Type> type;
    std::string name;
    std::shared_ptr<AssignRHS> rhs;
};

struct Skip : Stat {};

Tying the two together, I want the overridden visitSkip(..), visitDeclare(..), etc. (which all return std::any) to return std::shared_ptr<Skip>, std::shared_ptr<Declare>, etc., so that visitProgram(..) can receive them from a call to visit of the form

std::shared_ptr<Stat> stat = std::any_cast<std::shared_ptr<Stat>>(visit(ctx->stat()));

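However (!), std::any_cast only succeeds when the requested type exactly matches the type stored in the std::any; it does not convert a stored std::shared_ptr<Skip> to std::shared_ptr<Stat>, so this approach does not work (the cast throws std::bad_any_cast). Instead, I have started to create my own visitor entirely separate from (i.e. not a child of) the generated visitor.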
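To illustrate the exact-type behaviour, here is a minimal sketch that does not depend on ANTLR at all. The node types are cut-down stand-ins for the ones above, and returnDerived, returnAsBase and holdsStatPtr are hypothetical helper names used only for this illustration. It suggests one possible workaround: upcast to std::shared_ptr<Stat> before the value is wrapped in std::any.

```cpp
#include <any>
#include <cassert>
#include <memory>

// Cut-down stand-ins for the AST nodes in the question.
struct Stat { virtual ~Stat() = default; };
struct Skip : Stat {};

// What a visitSkip-style override effectively returns when it wraps the
// derived pointer directly: the std::any stores std::shared_ptr<Skip>.
std::any returnDerived() {
    return std::make_shared<Skip>();
}

// Upcast first, so the std::any stores std::shared_ptr<Stat>.
std::any returnAsBase() {
    return std::shared_ptr<Stat>(std::make_shared<Skip>());
}

// The pointer form of std::any_cast yields nullptr on a type mismatch
// instead of throwing std::bad_any_cast.
bool holdsStatPtr(const std::any& a) {
    return std::any_cast<std::shared_ptr<Stat>>(&a) != nullptr;
}
```

With these definitions, holdsStatPtr(returnDerived()) is false and holdsStatPtr(returnAsBase()) is true. In a visitor this would mean every statement override returns the base pointer type, and the caller recovers the concrete node with std::dynamic_pointer_cast where it needs it.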

I assume there is a better solution using the generated classes that I am missing.

Having found this answer to a similar post, is it even worth constructing an AST? If my idea of how to use ANTLR4 is inaccurate, please let me know and point me towards a good source I can start from. Thanks.

Edit: As per chapter 7 of The Definitive ANTLR 4 Reference, I believe I could achieve what I want by using a stack of BaseNode* and casting the popped nodes appropriately. This does not seem like the best solution. Ideally, I would like something similar to the Java implementation, where the generated visitor is generic and is parameterized with the expected return type.

Edit 2: I have now implemented such a solution (making use of a listener instead of visitors), example below from the exitAssign(..) function:

void Listener::exitAssign(Parser::AssignContext* ctx) {
    const auto rhs = std::static_pointer_cast<AssignRHS>(m_stack.top());
    m_stack.pop();

    const auto lhs = std::static_pointer_cast<AssignLHS>(m_stack.top());
    m_stack.pop();

    m_stack.push(std::make_shared<Assign>(lhs, rhs));
}

I still feel this solution is not the best: it feels very hacky, as the arguments must be popped in reverse order, and it is easy to forget to push onto the stack after creating the AST node.

I will use this implementation for now, but again, if a better method is preferred by people who use antlr 4 in c++, please do let me know.
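One way the stack approach might be made less error-prone is to hide the cast and pop behind a small typed helper. The following is only a sketch under assumptions: NodeStack, popAs and reduceAssign are hypothetical names, the node types are simplified stand-ins, and BaseNode is given a virtual destructor (which the original does not have) so that std::dynamic_pointer_cast can check the node type at run time.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <stack>
#include <stdexcept>
#include <utility>

// Assumption: a virtual destructor makes the hierarchy polymorphic.
struct BaseNode { virtual ~BaseNode() = default; };
struct Stat : BaseNode {};
struct AssignLHS : BaseNode {};
struct AssignRHS : BaseNode {};

struct Assign : Stat {
    Assign(std::shared_ptr<AssignLHS> lhs, std::shared_ptr<AssignRHS> rhs)
        : lhs(std::move(lhs)), rhs(std::move(rhs)) {}
    std::shared_ptr<AssignLHS> lhs;
    std::shared_ptr<AssignRHS> rhs;
};

// Hypothetical wrapper around the listener's node stack.
class NodeStack {
public:
    void push(std::shared_ptr<BaseNode> node) { m_stack.push(std::move(node)); }

    // Pops the top node as T, throwing if the stack is empty or the
    // top node is not a T. Nothing is popped when the check fails.
    template <typename T>
    std::shared_ptr<T> popAs() {
        if (m_stack.empty()) throw std::logic_error("pop from empty AST stack");
        auto node = std::dynamic_pointer_cast<T>(m_stack.top());
        if (!node) throw std::logic_error("unexpected node type on AST stack");
        m_stack.pop();
        return node;
    }

    std::size_t size() const { return m_stack.size(); }

private:
    std::stack<std::shared_ptr<BaseNode>> m_stack;
};

// Mirrors the body of exitAssign: children are popped in reverse order.
void reduceAssign(NodeStack& stack) {
    auto rhs = stack.popAs<AssignRHS>();
    auto lhs = stack.popAs<AssignLHS>();
    stack.push(std::make_shared<Assign>(std::move(lhs), std::move(rhs)));
}
```

This does not remove the need to pop in reverse order, but a wrong order or a forgotten push now surfaces as an exception at the point of the mistake rather than as a silently mis-cast pointer.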

Upvotes: 0

Views: 353

Answers (1)

Kaktuts

Reputation: 86

To parse an arithmetic expression, I preferred to use a listener that overrides the "exit" methods:

class MyListener final : public FormulaBaseListener {
public:
   void exitUnaryOp(FormulaParser::UnaryOpContext *ctx) override;
   void exitLiteral(FormulaParser::LiteralContext *ctx) override;
   void exitCell(FormulaParser::CellContext *ctx) override;
   void exitBinaryOp(FormulaParser::BinaryOpContext *ctx) override;

   std::vector<astToken> getResult();
};

int main() {
    antlr4::ANTLRInputStream input(".........");

    // Lexer: replace the default console listener with one that bails on errors.
    FormulaLexer lexer(&input);
    BailErrorListener error_listener;  // custom: derives from antlr4::BaseErrorListener and overrides syntaxError
    lexer.removeErrorListeners();
    lexer.addErrorListener(&error_listener);

    // Parser: stop at the first syntax error instead of trying to recover.
    antlr4::CommonTokenStream tokens(&lexer);
    FormulaParser parser(&tokens);
    parser.setErrorHandler(std::make_shared<antlr4::BailErrorStrategy>());
    parser.removeErrorListeners();

    FormulaParser::MainContext* tree = parser.main();

    // Walk the parse tree; the exit* overrides in MyListener build the result.
    MyListener listener;
    antlr4::tree::ParseTreeWalker::DEFAULT.walk(&listener, tree);
    std::vector<astToken> asttree = listener.getResult();  // get what you want
}

Upvotes: 0
