Seg. fault with std::unique_ptr and ctor

Question

For a parser I am currently implementing, I have (among others) these private member functions:

Parser private methods:

    Token const* current_token() const;
    Token const* next_token();
    Token const* peek_token();

    std::unique_ptr parse_expression();
    std::unique_ptr parse_type_specifier();
    std::unique_ptr parse_variable_declaration();
    std::unique_ptr parse_function_definition();
    std::unique_ptr parse_top_level_statement();

The implementation of parse_variable_declaration is:

parse_variable_declaration():

std::unique_ptr Parser::parse_variable_declaration() {
    next_token(); // consume 'var'

    if (current_token()->get_type() != TokenTypes::identifier) {
        throw parser_error(current_token(), "", "expected identifier\n");
    }
    const auto id = current_token(); // store identifier
    next_token(); // consume identifier

    std::unique_ptr type;
    std::unique_ptr expr;

    auto assignment_required = true;
    if (current_token()->get_type() == TokenTypes::op_colon) { // optional type specifier
        next_token(); // consume ':'

        type = parse_type_specifier();
        assignment_required = false;
    }

    if (assignment_required && current_token()->get_type() != TokenTypes::op_equals) {
        throw parser_error(current_token(), "", "expected equals operator\n");
    }

    if (current_token()->get_type() == TokenTypes::op_equals) {
        next_token(); // consume '='

        expr = parse_expression();
    }

    if (current_token()->get_type() != TokenTypes::op_semi_colon) {
        throw parser_error(current_token(), "", "expected semi-colon\n");
    }

    next_token(); // consume ';'

    DEBUG_STDERR("parsed: variable_declaration_statement\n");
    return std::make_unique<VariableDeclarationStatement>(
        id->get_string(), std::move(type), std::move(expr));
}

The last line (the return statement) ends in a segmentation fault. It essentially calls the constructor of VariableDeclarationStatement:

VariableDeclarationStatement ctor:

VariableDeclarationStatement::VariableDeclarationStatement(
    std::string const& name,
    std::unique_ptr type_specifier,
    std::unique_ptr expr
):
    m_name{name},
    m_type_specifier{std::move(type_specifier)},
    m_expr{std::move(expr)}
{}

I have been debugging this since yesterday and can't find out why it does not work as intended. I want to build the abstract syntax tree (the parser output) with unique pointers to the child nodes, because each node is the only owner of its children. That is why I am trying so hard to make unique pointers work.

Console output: DEBUG_STDERR

parsed: primitive_type_int // from parse_type_specifier()
parsed: integral_expression // from parse_expression()
parsed: variable_declaration_statement
[1]    12638 segmentation fault (core dumped)  ./cion_compiler

Mikael Persson · Accepted Answer

Moving a unique pointer boils down to copying the raw pointer and setting the source to null. There is no reason why any implementation of unique_ptr would dereference the pointer in the process of moving it. Therefore, the likelihood that this operation is responsible for the seg-fault is virtually zero.
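To make that concrete, here is a minimal sketch. Node is a hypothetical stand-in, not one of your AST classes. It shows that moving a null unique_ptr is harmless (which is what happens to type or expr when the optional part is absent) and that moving a non-null one only hands over the raw pointer.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical stand-in for an AST node; not the asker's real class.
struct Node {
    int value = 0;
};

// Moving a null unique_ptr is well defined: both pointers end up null
// and nothing is dereferenced along the way.
bool move_of_null_is_safe() {
    std::unique_ptr<Node> empty;                       // holds nullptr
    std::unique_ptr<Node> target = std::move(empty);
    return empty == nullptr && target == nullptr;
}

// Moving a non-null unique_ptr transfers the raw pointer; the Node itself
// is neither copied nor accessed.
bool move_transfers_ownership() {
    auto source = std::make_unique<Node>();
    Node* raw = source.get();
    assert(raw != nullptr);
    std::unique_ptr<Node> target = std::move(source);
    return source == nullptr && target.get() == raw;
}
```

Both functions return true, so neither the std::move calls in the return statement nor the ones in the constructor's initializer list are likely suspects.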

In your return statement / constructor call, you do have at least one very obvious pointer dereference: the id->get_string() call.

For one, the id pointer is created like so:

  const Token* const id = current_token(); // store identifier
  next_token(); // consume identifier

Unless there is a guarantee that any pointer returned by current_token() will be valid until the end of time (or within the life-time of the current parsing operation), it is very possible that after the call to next_token(), the id pointer is invalid, i.e., pointing to a non-existent or defunct Token object.

Even if the id pointer still points to an existing Token object, it is possible that it is in a "zombie" state, and that obtaining a string from it, through get_string(), is an invalid operation.
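Since the Token class and the token storage are not shown, the following is only a sketch under assumptions: Token, TokenStream and both helper functions are hypothetical. It assumes the parser keeps a single "current token" slot that next_token() overwrites. That keeps the demonstration well defined while still showing the stale read; if the real storage frees or reallocates tokens instead, the same pattern would be undefined behavior and could crash exactly as described.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical token type; the real Token class is not shown in the question.
class Token {
public:
    explicit Token(std::string text) : m_text{std::move(text)} {}
    std::string const& get_string() const { return m_text; }

private:
    std::string m_text;
};

// Hypothetical token source that keeps only ONE token alive at a time.
// Expects at least one word.
class TokenStream {
public:
    explicit TokenStream(std::vector<std::string> words)
        : m_words{std::move(words)}, m_current{m_words.at(0)} {}

    Token const* current_token() const { return &m_current; }

    // Overwrites the single slot, so pointers handed out earlier now see
    // the NEW token, not the one they were taken for.
    Token const* next_token() {
        if (m_index + 1 < m_words.size()) {
            ++m_index;
        }
        m_current = Token{m_words[m_index]};
        return &m_current;
    }

private:
    std::vector<std::string> m_words;
    std::size_t m_index = 0;
    Token m_current;
};

// Risky pattern from the question: keep the pointer, advance, then read.
std::string name_via_stale_pointer(TokenStream& tokens) {
    Token const* id = tokens.current_token();
    tokens.next_token();
    return id->get_string();  // reads whatever occupies the slot now
}

// Safer pattern: copy what is needed before advancing.
std::string name_via_copy(TokenStream& tokens) {
    std::string name = tokens.current_token()->get_string();
    tokens.next_token();
    return name;
}
```

With the words "x", ":", "int", name_via_stale_pointer returns ":" while name_via_copy returns "x". Copying the identifier string before calling next_token() removes the dependency on the token's lifetime altogether.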

If I were you, that is where I would look for the source of the seg-fault. You might also want to run the program under a debugger or memory checker. It will likely point you to the get_string function, either at the dereference of the this pointer (the id pointer) or during the construction of the string itself. It could also point you towards the virtual-table look-up, if get_string is a virtual function in the Token class. Either way, I strongly suspect that this is the cause of the seg-fault, because it is the only overtly dangerous code in what you have posted.
