Ruperrrt

Reputation: 529

Dangling reference when returning reference to reference parameter bound to temporary

This question refers to Howard Hinnant's answer to the question Guaranteed elision and chained function calls.

At the bottom of his answer, he says:

Note that in this latest design, if your client ever does this:

X&& x = a + b + c;

then x is a dangling reference (which is why std::string does not do this).

The paragraph "Lifetime of a temporary" of the article "Reference initialization" on cppreference.com lists exceptions to the lifetime rules of temporary objects bound to a reference. One of them is:

"a temporary bound to a reference parameter in a function call exists until the end of the full expression containing that function call: if the function returns a reference, which outlives the full expression, it becomes a dangling reference."

I think what is actually meant is "if the function returns a reference to the temporary object that the reference parameter is bound to", not just any reference. I therefore reckon this is the rule that explains Howard Hinnant's statement quoted above.
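To make the quoted rule concrete, here is a minimal sketch. The names `Tracker`, `pass_through`, `events` and `run_demo` are made up for this illustration and do not come from the linked question. It records when the temporary is constructed and destroyed, and it deliberately never reads through the dangling reference, so the sketch itself stays within defined behaviour. Some compilers may warn about the first binding, which is the point of the demonstration.

```cpp
#include <string>
#include <vector>

// Hypothetical event log used only by this sketch.
std::vector<std::string> events;

struct Tracker {
    int value;
    explicit Tracker(int v) : value(v) { events.push_back("ctor"); }
    ~Tracker() { events.push_back("dtor"); }
};

// Returns a reference to the very object its parameter is bound to.
const Tracker& pass_through(const Tracker& t) { return t; }

int run_demo() {
    events.clear();

    // The temporary Tracker{42} is bound to the parameter t, so it lives only
    // until the end of this full expression. r dangles afterwards and is
    // intentionally never used.
    [[maybe_unused]] const Tracker& r = pass_through(Tracker{42});
    events.push_back("after");

    // Reading through the returned reference inside the same full expression
    // is fine, because the temporary is still alive at that point.
    int v = pass_through(Tracker{7}).value;
    return v;
}
```

Calling `run_demo()` should leave `"dtor"` in `events` before `"after"`, showing that binding the returned reference to `r` does not extend the temporary's lifetime.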

The following example is based on the example given in the question that I am referring to:

#include <iostream>
#include <utility>  // std::move

struct X
{
    int _x;
    X() : _x(0) {}
    X(int x) : _x(x) {}
    X(X const& other) : _x(other._x) {}
    X(X&& other) noexcept : _x(other._x) { other._x = 0; std::cout << "Move from " << &other << " to " << this << std::endl; }
    X& operator+=(const X& other) { _x += other._x; return *this; }

    friend X operator+(X const& lhs, X const& rhs)
    {
        std::cout << "X const& lhs: " << &lhs << std::endl;
        X temp = lhs;
        temp += rhs;
        return temp;
    }

    friend X&& operator+(X&& lhs, X const& rhs)
    {
        std::cout << "X&& lhs: " << &lhs << std::endl;
        lhs += rhs;
        return std::move(lhs);
    }
};

int anotherFunc(int a)
{
    int bigArray[3000]{};
    std::cout << "ignore:" << &bigArray << std::endl;
    int b = a * a;
    std::cout << "int b: " << &b << std::endl;
    return 2 * b;
}

int main()
{
    X a(1), b(2), c(3), d(4);
    X&& sum = a + b + c + d;
    std::cout << "X&& sum: " << &sum << std::endl;
    anotherFunc(15);
    std::cout << "sum._x: " << sum._x << std::endl;

    return 0;
}

This prints

X const& lhs: 000000907DAFF8B4
Move from 000000907DAFF794 to 000000907DAFFA14
X&& lhs: 000000907DAFFA14
X&& lhs: 000000907DAFFA14
X&& sum: 000000907DAFFA14
ignore:000000907DAFC360
int b: 000000907DAFF254
sum._x: 10

when compiled with MSVC; and similar outputs when compiled with gcc or clang.

sum should be a dangling reference here. Still, the correct value "10" is printed. It even works when a large array is pushed onto the stack between the initialization of sum and the access through it. The memory used for the temporary object that sum refers to does not get reused; it always lies elsewhere relative to the stack frame of the subsequent function call, no matter how big or small that stack frame is.

Why does every compiler that I've tested preserve the temporary object passed to X&& operator+(X&& lhs, X const& rhs), even though sum should be a dangling reference according to the rule on cppreference.com? Or, to be more precise: given that accessing a dangling reference is undefined behaviour, why does every compiler implement it that way?

Upvotes: 3

Views: 700

Answers (1)

Howard Hinnant

Reputation: 218720

I like to keep an example class A around for situations like this. The full definition of A is a little too lengthy to list here, but it is included in its entirety at this link.

In a nutshell, A keeps a state and a status, and the status can be one of these enumerators:

    destructed             = -4,
    self_move_assigned     = -3,
    move_assigned_from     = -2,
    move_constructed_from  = -1,
    constructed_specified  =  0

That is, the special members set the status accordingly. For example, ~A() looks like this:

~A()
{
    assert(is_valid());
    --count;
    state_ = randomize();
    status_ = destructed;
}

And there's a streaming operator that prints this class out.
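For readers without access to the full definition, the following is a rough sketch of what such a class could look like. It is not the real A: the name `Probe`, the accessors, and the fixed value written in the destructor are assumptions made for illustration (the destructor shown above randomizes the state and also maintains a count, which is omitted here).

```cpp
#include <iostream>
#include <utility>

// Hypothetical stand-in for a status-tracking class; not the answer's A.
class Probe {
public:
    enum Status {
        destructed            = -4,
        self_move_assigned    = -3,
        move_assigned_from    = -2,
        move_constructed_from = -1,
        constructed_specified =  0
    };

    explicit Probe(int state) : state_(state), status_(constructed_specified) {}

    // Simplification: copies are reported as constructed_specified.
    Probe(const Probe& other) : state_(other.state_), status_(constructed_specified) {}

    Probe(Probe&& other) noexcept : state_(other.state_), status_(constructed_specified) {
        other.state_ = 0;
        other.status_ = move_constructed_from;
    }

    Probe& operator=(Probe&& other) noexcept {
        if (this == &other) {
            status_ = self_move_assigned;
        } else {
            state_ = other.state_;
            status_ = constructed_specified;
            other.state_ = 0;
            other.status_ = move_assigned_from;
        }
        return *this;
    }

    ~Probe() {
        state_ = -1;           // fixed marker here; the real class randomizes
        status_ = destructed;  // reading this after destruction is still UB
    }

    int state() const { return state_; }
    Status status() const { return status_; }

private:
    int state_;
    Status status_;
};

// Prints the state, prefixed by the status code when the object is not in
// its normal constructed state.
std::ostream& operator<<(std::ostream& os, const Probe& p) {
    if (p.status() == Probe::constructed_specified) return os << p.state();
    return os << "status " << static_cast<int>(p.status()) << ": " << p.state();
}
```

The moved-from states can be inspected legally, for example `Probe a{1}; Probe b{std::move(a)};` leaves `a.status()` equal to `Probe::move_constructed_from`. Only the `destructed` status can never be observed without undefined behaviour.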

Language lawyer disclaimer: Printing out a destructed A is undefined behavior, and anything could happen. That being said, when experiments are compiled with optimizations turned off, you typically get the expected result.

For me, using clang at -O0, this:

#include "A.h"
#include <iostream>

int
main()
{
    A a{1};
    A b{2};
    A c{3};
    A&& x = a + b + c;
    std::cout << x << '\n';
}

Outputs:

destructed: -1002199219

Changing the line to:

    A x = a + b + c;

Results in:

6
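The by-value case works because x is initialized from the temporary while the temporary is still alive; the temporary is destroyed only afterwards, at the end of the full expression. A small sketch of that ordering follows, using a made-up type `V` and an event log `log_lines` (both hypothetical, standing in for A), with operators modelled on the ones in the question.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical event log used only by this sketch.
std::vector<std::string> log_lines;

struct V {
    int value;
    explicit V(int v) : value(v) {}
    V(const V& o) : value(o.value) { log_lines.push_back("copy"); }
    V(V&& o) noexcept : value(o.value) {
        o.value = 0;
        log_lines.push_back("move");
    }
    ~V() { log_lines.push_back("dtor " + std::to_string(value)); }

    friend V operator+(const V& lhs, const V& rhs) {
        V temp = lhs;
        temp.value += rhs.value;
        return temp;
    }

    // Same shape as the question's overload: returns a reference to lhs.
    friend V&& operator+(V&& lhs, const V& rhs) {
        lhs.value += rhs.value;
        return std::move(lhs);
    }
};

int sum_by_value() {
    log_lines.clear();
    V a{1}, b{2}, c{3};
    // x is move-constructed from the temporary created by a + b before that
    // temporary is destroyed at the end of this full expression.
    V x = a + b + c;
    log_lines.push_back("after init");
    return x.value;
}
```

In the log, the entry just before `"after init"` should be the destruction of the emptied temporary, preceded by the `"move"` that filled x, so x owns its value independently of the temporary.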

Upvotes: 1
