Reputation: 68718

Why not always assign return values to const reference?

Let's say I have some function:

Foo GetFoo(..)
{
  ...
}

Assume that we know neither how this function is implemented nor the internals of Foo (it can be a very complex object, for example). However, we do know that the function returns Foo by value and that we want to use this return value as const.

Question: Would it always be a good idea to store the return value of this function as const &?

const Foo& f = GetFoo(...);

instead of,

const Foo f = GetFoo(...);

I know that compilers perform return value optimization and may move the object instead of copying it, so in the end const & might not have any advantages. However, my question is: are there any disadvantages? Why shouldn't I just develop muscle memory to always use const & to store return values, given that I then don't have to rely on compiler optimizations and that even a move operation can be expensive for complex objects?

Stretching this to the extreme, why shouldn't I always use const & for all variables that are immutable in my code? For example,

const int& a = 2;
const int& b = 2;
const int& c = a + b;

Besides being more verbose, are there any disadvantages?

Upvotes: 48

Answers (4)

Leon

Reputation: 32464

The semantic difference between const C& and const C in the considered case (when selecting the type for a variable) can affect your program in the cases listed below. They must be taken into account not only when writing new code, but also during subsequent maintenance, since certain changes to the source code may change where a variable definition belongs in this classification.

Initializer is an lvalue of exactly type `C`

const C& foo();
const C  a = foo(); // (1)
const C& b = foo(); // (2)

(1) introduces an independent object (to an extent allowed by the copy semantics of the type C), whereas (2) creates an alias to another object and is subject to all changes happening to that object (including its end-of-life).

Initializer is an lvalue of a type derived from `C`

struct D : C { ... };
const D& foo();
const C  a = foo(); // (1)
const C& b = foo(); // (2)

(1) is a sliced version of what was returned from foo(). (2) is bound to the derived object and can enjoy the benefits of polymorphic behavior (if any), though at the risk of being bitten by aliasing problems.
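
The difference between (1) and (2) can be observed with a virtual function. The following is a minimal sketch, assuming hypothetical definitions of C and D (the answer elides their bodies) and a hypothetical virtual member name():

```cpp
#include <string>

// Hypothetical stand-ins for the C and D of the answer.
struct C {
    virtual ~C() = default;
    virtual std::string name() const { return "C"; }
};

struct D : C {
    std::string name() const override { return "D"; }
};

// Returns an lvalue reference to a long-lived derived object.
const D& foo() {
    static const D d{};
    return d;
}

// (1) copy-initializes a C from a D: only the C sub-object is copied (slicing).
std::string name_via_value() {
    const C a = foo();
    return a.name();
}

// (2) binds a reference to the original D object, so dispatch is dynamic.
std::string name_via_reference() {
    const C& b = foo();
    return b.name();
}
```

Here name_via_value() yields "C" because the copy really is a C, while name_via_reference() yields "D" because the reference still designates the derived object.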

Initializer is an rvalue of a type derived from `C`

struct D : C { ... };
D foo();
const C  a = foo(); // (1)
const C& b = foo(); // (2)

For (1), this is no different from the previous case. Regarding (2), there is no more aliasing! The constant reference is bound to the temporary of derived type, whose lifetime extends to the end of the enclosing scope, with the correct destructor (~D()) automatically being called. (2) can enjoy the benefits of polymorphism, but pays the price of the extra resources consumed by D compared to C.
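
The lifetime extension and the destructor selection described for (2) can be checked by logging events. This is only a sketch: event_log() and run() are hypothetical helpers, the destructors are deliberately non-virtual, and the order shown assumes the compiler elides the copy in foo() (guaranteed since C++17):

```cpp
#include <string>
#include <vector>

// Hypothetical helper that records the order of events.
std::vector<std::string>& event_log() {
    static std::vector<std::string> log;
    return log;
}

struct C {
    ~C() { event_log().push_back("~C"); }   // deliberately non-virtual
};

struct D : C {
    ~D() { event_log().push_back("~D"); }
};

D foo() { return D(); }

std::vector<std::string> run() {
    event_log().clear();
    {
        const C& b = foo();   // the temporary D's lifetime is extended to this scope
        (void)b;
        event_log().push_back("end of scope");
    }                          // ~D runs here, followed by ~C
    return event_log();
}
```

The log ends with "end of scope", "~D", "~C": the temporary survives until the reference goes out of scope, and it is destroyed as a D even though the reference has type const C& and ~C() is not virtual.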

Initializer is an rvalue of a type convertible to an lvalue of type `C`

struct B {
    C c;
    operator const C& () const { return c; }
};
const B foo();
const C  a = foo(); // (1)
const C& b = foo(); // (2)

(1) makes its copy and goes on, while (2) is in trouble starting immediately from the next statement, since it aliases a sub-object of a dead object!
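
The problem can be made visible without actually reading through the dangling reference. The sketch below assumes a hypothetical flag b_destroyed set by ~B(), and a hypothetical data member value in C; foo() returns a plain B rather than const B, which does not change the outcome:

```cpp
// Hypothetical flag: set when a B object is destroyed.
bool b_destroyed = false;

struct C { int value = 42; };

struct B {
    C c;
    ~B() { b_destroyed = true; }
    operator const C&() const { return c; }
};

B foo() { return B(); }

// Reports whether the temporary B was already dead once the reference was initialized.
bool temporary_dead_after_binding() {
    b_destroyed = false;
    const C& b = foo();   // binds to the result of the conversion, a member of the temporary
    (void)b;              // never read through b: that would be undefined behavior
    return b_destroyed;   // the temporary died at the end of the full-expression above
}

// The by-value form copies the member while the temporary is still alive.
int copy_is_safe() {
    const C a = foo();
    return a.value;
}
```

temporary_dead_after_binding() returns true: no lifetime extension applies because the reference is bound to the lvalue returned by the conversion function, not to the temporary itself.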

Upvotes: 5

Yakk - Adam Nevraumont

Reputation: 275350

Calling elision an "optimization" is a misconception. Compilers are permitted not to do it, but they are also permitted to implement a+b integer addition as a sequence of bitwise operations with manual carry.

A compiler which did that would be hostile: so too a compiler that refuses to elide.

Elision is not like "other" optimizations, as those rely on the as-if rule (behaviour may change so long as it behaves as-if the standard dictates). Elision may change the behaviour of the code.
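
That elision is observable can be seen by giving the copy and move constructors a side effect. A minimal sketch, assuming a hypothetical counter copies_or_moves and a compiler that elides here (mandatory since C++17, and the default behaviour of mainstream compilers before that):

```cpp
// Hypothetical instrumentation: counts copy and move constructions.
int copies_or_moves = 0;

struct Foo {
    Foo() = default;
    Foo(const Foo&) { ++copies_or_moves; }
    Foo(Foo&&) noexcept { ++copies_or_moves; }
};

Foo GetFoo() { return Foo(); }

int count_for_value() {
    copies_or_moves = 0;
    const Foo f = GetFoo();    // constructed directly in f
    (void)f;
    return copies_or_moves;
}

int count_for_reference() {
    copies_or_moves = 0;
    const Foo& f = GetFoo();   // bound to a lifetime-extended temporary
    (void)f;
    return copies_or_moves;
}
```

With elision both functions return 0, so the side effects of the constructors never happen. In this situation the reference form saves nothing over the value form.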

As to why using const & or even rvalue && is a bad idea, references are aliases to an object. With either, you do not have a (local) guarantee that the object will not be manipulated elsewhere. In fact, if the function returns a &, const& or &&, the object must exist elsewhere with another identity in practice. So your "local" value is instead a reference to some unknown distant state: this makes the local behaviour difficult to reason about.

Values, on the other hand, cannot be aliased. You can form such aliases after creation, but a const local value cannot be modified under the standard, even if an alias exists for it.

Reasoning about local objects is easy. Reasoning about distributed objects is hard. References are distributed in type: if you are choosing between a case of reference or value and there is no obvious performance cost to the value, always choose values.

To be concrete:

Foo const& f = GetFoo();

could either be a reference binding to a temporary of type Foo or derived returned from GetFoo(), or a reference bound to something else stored within GetFoo(). We cannot tell from that line.

Foo const& GetFoo();

Foo GetFoo();

make f have different meanings, in effect.

Foo f = GetFoo();

always gives f its own object (a copy, if GetFoo() returns a reference). Nothing that does not modify "through" f will modify f (unless its ctor passed a pointer to itself to someone else, of course).

If we have

const Foo f = GetFoo();

we even have the guarantee that modifying (non-mutable parts of) f is undefined behavior. We can assume f is immutable, and in fact the compiler will do so.

In the const Foo& case, modifying f can be defined behavior if the underlying storage was non-const. So we cannot assume f is immutable, and the compiler will only assume it is immutable if it can examine all code that has validly-derived pointers or references to f and determine that none of them mutate it (even if you just pass around const Foo&, if the original object was a non-const Foo, it is legal to const_cast<Foo&> and modify it).
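
Both effects can be sketched in a few lines. The struct below is a hypothetical Foo with a single int member, used only for illustration:

```cpp
// Hypothetical Foo with one data member.
struct Foo { int value = 0; };

// Well defined: the referenced object was not declared const, so casting
// away constness from the reference and writing through it is allowed.
int mutate_through_const_ref() {
    Foo storage;                      // non-const underlying object
    const Foo& f = storage;
    const_cast<Foo&>(f).value = 7;
    return f.value;
}

// No cast is needed at all when the owner changes its own object.
int observe_change_through_alias() {
    Foo storage;
    const Foo& f = storage;
    storage.value = 9;
    return f.value;
}
```

The first function returns 7 and the second returns 9, so a const Foo& only promises that this particular name will not be used to modify the object. Applying the same const_cast to an object declared as const Foo would be undefined behavior, which is why it is not shown here.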

In short, don't prematurely pessimize and assume elision "won't happen". There are very few current compilers that won't elide unless you explicitly turn it off, and you almost certainly won't be building a serious project on them.

Upvotes: 38

David Schwartz

Reputation: 182753

These have semantic differences, and if you ask for something other than what you want, you will be in trouble if you get it. Consider this code:

#include <stdio.h>

class Bar
{
    public:
    Bar() { printf ("Bar::Bar\n"); }
    ~Bar() { printf ("Bar::~Bar\n"); }
    Bar(const Bar&) { printf("Bar::Bar(const Bar&)\n"); }
    void baz() const { printf("Bar::Baz\n"); }
};

class Foo
{
    Bar bar;

    public:
    Bar& getBar () { return bar; }
    Foo() { }
};

int main()
{
    printf("This is safe:\n");
    {
        Foo *x = new Foo();
        const Bar y = x->getBar();
        delete x;
        y.baz();
    }
    printf("\nThis is a disaster:\n");
    {
        Foo *x = new Foo();
        const Bar& y = x->getBar();
        delete x;
        y.baz();
    }
    return 0;
}

Output is:

This is safe:
Bar::Bar
Bar::Bar(const Bar&)
Bar::~Bar
Bar::Baz
Bar::~Bar

This is a disaster:
Bar::Bar
Bar::~Bar
Bar::Baz

Notice we call Bar::Baz after the Bar is destroyed. Oops.

Ask for what you want, that way you're not screwed if you get what you ask for.

Upvotes: 30

Mark Ransom

Reputation: 308130

Building on what @David Schwartz said in the comments, you need to be sure the semantics don't change. It isn't enough that you intend to treat the value as immutable; the function you got it from should treat it as immutable too, or you're going to get a surprise.

image.SetPixel(x, y, white_pixel);
const Pixel &pix = image.GetPixel(x, y);
image.SetPixel(x, y, black_pixel);
cout << pix;
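
A self-contained version of this scenario might look as follows. The answer does not show Image or Pixel, so both types here are hypothetical, and the sketch assumes that GetPixel returns a const Pixel& into the image's own storage:

```cpp
#include <array>

// Hypothetical pixel type.
struct Pixel {
    int r, g, b;
    bool operator==(const Pixel& o) const { return r == o.r && g == o.g && b == o.b; }
};

// Hypothetical 2x2 image that hands out references to its own storage.
class Image {
    std::array<Pixel, 4> pixels_{};
public:
    const Pixel& GetPixel(int x, int y) const { return pixels_[y * 2 + x]; }
    void SetPixel(int x, int y, const Pixel& p) { pixels_[y * 2 + x] = p; }
};

const Pixel white_pixel{255, 255, 255};
const Pixel black_pixel{0, 0, 0};

// The reference tracks the image, so it observes the later write.
bool reference_sees_later_write() {
    Image image;
    image.SetPixel(0, 0, white_pixel);
    const Pixel& pix = image.GetPixel(0, 0);
    image.SetPixel(0, 0, black_pixel);
    return pix == black_pixel;
}

// The copy is a snapshot taken before the later write.
bool copy_keeps_old_value() {
    Image image;
    image.SetPixel(0, 0, white_pixel);
    const Pixel pix = image.GetPixel(0, 0);
    image.SetPixel(0, 0, black_pixel);
    return pix == white_pixel;
}
```

Both functions return true under these assumptions: the "immutable" reference ends up black, while the by-value copy stays white.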

Upvotes: 14
