Reputation: 164
For a parser I am currently implementing, I have (among others) these private methods:
Parser private methods:
Token const* current_token() const;
Token const* next_token();
Token const* peek_token();
std::unique_ptr<ast::Expression> parse_expression();
std::unique_ptr<ast::TypeSpecifier> parse_type_specifier();
std::unique_ptr<ast::VariableDeclarationStatement> parse_variable_declaration();
std::unique_ptr<ast::Statement> parse_function_definition();
std::unique_ptr<ast::Statement> parse_top_level_statement();
The implementation of the parse_variable_declaration method is this:
parse_variable_declaration():
std::unique_ptr<ast::VariableDeclarationStatement> Parser::parse_variable_declaration() {
    next_token(); // consume 'var'
    if (current_token()->get_type() != TokenTypes::identifier) {
        throw parser_error(current_token(), "", "expected identifier\n");
    }
    const auto id = current_token(); // store identifier
    next_token(); // consume identifier
    std::unique_ptr<ast::TypeSpecifier> type;
    std::unique_ptr<ast::Expression> expr;
    auto assignment_required = true;
    if (current_token()->get_type() == TokenTypes::op_colon) { // optional type specifier
        next_token(); // consume ':'
        type = parse_type_specifier();
        assignment_required = false;
    }
    if (assignment_required && current_token()->get_type() != TokenTypes::op_equals) {
        throw parser_error(current_token(), "", "expected equals operator\n");
    }
    if (current_token()->get_type() == TokenTypes::op_equals) {
        next_token(); // consume '='
        expr = parse_expression();
    }
    if (current_token()->get_type() != TokenTypes::op_semi_colon) {
        throw parser_error(current_token(), "", "expected semi-colon\n");
    }
    next_token(); // consume ';'
    DEBUG_STDERR("parsed: variable_declaration_statement\n");
    return std::make_unique<ast::VariableDeclarationStatement>(
        id->get_string(), std::move(type), std::move(expr));
}
The last line (the return) ends in a segmentation fault. It basically calls the constructor of VariableDeclarationStatement:
VariableDeclarationStatement ctor:
VariableDeclarationStatement::VariableDeclarationStatement(
    std::string const& name,
    std::unique_ptr<TypeSpecifier> type_specifier,
    std::unique_ptr<Expression> expr
):
    m_name{name},
    m_type_specifier{std::move(type_specifier)},
    m_expr{std::move(expr)}
{}
I have been debugging this since yesterday and can't find out why it does not work as intended. I want to build the Abstract Syntax Tree (the parser output) from nodes that hold unique pointers to their child nodes, because each node is the only owner of its children. That is why I am trying so hard to make unique_ptr work here.
Console output: DEBUG_STDERR
parsed: primitive_type_int // from parse_type_specifier()
parsed: integral_expression // from parse_expression()
parsed: variable_declaration_statement
[1] 12638 segmentation fault (core dumped) ./cion_compiler
Upvotes: 1
Views: 1890
Reputation: 164
As you correctly suggested, the error was hidden in the suspicious id pointer. The parser in my program receives Tokens via unique_ptr from the lexer and stores each one directly as the current token. The method current_token() therefore returned a raw pointer to the Token owned by that unique_ptr, and that Token is destroyed as soon as the next call to next_token() takes place. Storing this pointer in id and using it after the Token had already been destroyed caused the problem.
I fixed the code in two ways.
First, I changed the return types of the helper methods above from "Token const*" to "Token const&". Second, the id variable now only stores a copy of the get_string() value and no longer holds a pointer to the token.
With these changes the segmentation fault was solved! =)
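The fix described above can be sketched roughly as follows. This is only a minimal illustration, not the actual parser: the Token and Parser classes are hypothetical stand-ins, and next_token() takes the next token as a parameter here (instead of pulling it from a lexer) purely to keep the example self-contained.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for the Token class from the question.
class Token {
public:
    explicit Token(std::string text) : m_text{std::move(text)} {}
    std::string const& get_string() const { return m_text; }
private:
    std::string m_text;
};

// Hypothetical stand-in for the parser: it owns only the current token.
class Parser {
public:
    explicit Parser(std::unique_ptr<Token> first) : m_current{std::move(first)} {}

    // The returned reference is valid only until the next call to next_token().
    Token const& current_token() const { return *m_current; }

    // Replaces the current token; the previous Token object is destroyed here.
    // (Assumption: the real parser would fetch the token from the lexer.)
    void next_token(std::unique_ptr<Token> next) { m_current = std::move(next); }

private:
    std::unique_ptr<Token> m_current;
};

// Copies the identifier text before advancing, so nothing refers to the
// destroyed token afterwards.
std::string read_identifier(Parser& parser) {
    std::string name = parser.current_token().get_string(); // copy the string
    parser.next_token(std::make_unique<Token>("="));        // old token destroyed
    return name;                                            // independent copy
}
```

Because read_identifier returns a std::string by value, the result stays usable no matter how many tokens are consumed afterwards.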
Upvotes: 0
Reputation: 18572
The move operations on unique pointers basically boil down to simple pointer copies. There is no reason why any implementation of unique_ptr would dereference the pointers in the process of moving them. Therefore, the likelihood that this operation is responsible for the seg-fault is virtually zero.
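To illustrate the point about moves, here is a small sketch. Node and take are hypothetical names made up for this example. It relies only on the standard guarantee that a moved-from std::unique_ptr is left empty and that the managed object itself is not touched, which is also why moving an empty unique_ptr (such as a type specifier that was never parsed) is harmless.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical node type, used only for this illustration.
struct Node {
    int value;
};

// Transfers ownership out of `source`. Only the stored pointer changes
// hands; the Node it points to is neither copied nor dereferenced.
std::unique_ptr<Node> take(std::unique_ptr<Node>& source) {
    return std::move(source);
}
```

After take(p), p compares equal to nullptr and the returned unique_ptr holds the same address that p.get() reported before the move.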
In your return statement / constructor call, you do have one (or more) very obvious pointer dereference, as part of the id->get_string() call.
For one, the id pointer is created like so:
const Token* const id = current_token(); // store identifier
next_token(); // consume identifier
Unless there is a guarantee that any pointer returned by current_token() will be valid until the end of time (or within the lifetime of the current parsing operation), it is very possible that after the call to next_token(), the id pointer is invalid, i.e., pointing to a non-existent or defunct Token object.
Even if the id pointer still points to an existing Token object, it is possible that it is in a "zombie" state, and that obtaining a string from it, through get_string(), is an invalid operation.
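One possible way to get the lifetime guarantee mentioned above is to keep every token alive for the whole parse. The sketch below is an assumption about how that could look, not the design of the posted parser: Token is a hypothetical stand-in and TokenBuffer is an invented helper.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the Token class from the question.
class Token {
public:
    explicit Token(std::string text) : m_text{std::move(text)} {}
    std::string const& get_string() const { return m_text; }
private:
    std::string m_text;
};

// Invented helper: owns all tokens seen so far, so raw pointers to them
// stay valid for as long as the buffer itself lives.
class TokenBuffer {
public:
    // Takes ownership of a token and makes it the current one.
    Token const* push(std::unique_ptr<Token> token) {
        m_tokens.push_back(std::move(token));
        return m_tokens.back().get();
    }

    // Returns the most recently pushed token, or nullptr if there is none.
    Token const* current_token() const {
        return m_tokens.empty() ? nullptr : m_tokens.back().get();
    }

private:
    // The vector may reallocate, but that only moves the unique_ptrs;
    // the heap-allocated Token objects keep their addresses.
    std::vector<std::unique_ptr<Token>> m_tokens;
};
```

The trade-off is memory: all tokens are retained until the buffer is destroyed, whereas copying the needed string out of the token keeps only what the AST actually uses.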
If I were you, that is where I would be looking for the source of the seg-fault. You might also want to run this in a (memory) debugger to get to the source of it. It will likely point you to the get_string function, either during the dereferencing of the this pointer (the id pointer) or during the construction of the string itself. It could also point you towards the virtual-table look-up, if get_string is a virtual function in the Token class. Either way, I highly suspect that this is the cause of the seg-fault, because it is the only overtly dangerous code in what you have posted.
Upvotes: 2