Can an optimizing compiler add std::move?

Question

Can a compiler do automatic lvalue-to-rvalue conversion if it can prove that the lvalue won't be used again? Here's an example to clarify what I mean:

void Foo(vector values) { ...}

void Bar() {
  vector my_values {1, 2, 3};
  Foo(my_values);  // may the compiler pretend I used std::move here?
}

If a std::move is added to the commented line, then the vector can be moved into Foo's parameter, rather than copied. However, as written, I didn't use std::move.

It's pretty easy to statically prove that my_values won't be used after the commented line. So s the compiler allowed to move the vector, or is it required to copy it?

Yakk - Adam Nevraumont · Accepted Answer

The compiler is required to behave as-if the copy occurred from the vector to the call of Foo.

If the compiler can prove that there are is a valid abstract machine behavior with no observable side effects (within the abstract machine behavior, not in a real computer!) that involves moving the std::vector into Foo, it can do this.

In your above case, this (moving has no abstract machine visible side effects) is true; the compiler may not be able to prove it, however.

The possibly observable behavior when copying a std::vector is:

Invoking copy constructors on the elements. Doing so with int cannot be observed
Invoking the default std::allocator<> at different times. This invokes ::new and ::delete (maybe¹) In any case, ::new and ::delete has not been replaced in the above program, so you cannot observe this under the standard.
Calling the destructor of T more times on different objects. Not observable with int.
The vector being non-empty after the call to Foo. Nobody examines it, so it being empty is as-if it was not.
References or pointers or iterators to the elements of the exterior vector being different than those inside. No references, vectors or pointers are taken to the elements of the vector outside Foo.

While you may say "but what if the system is out of memory, and the vector is large, isn't that observable?":

The abstract machine does not have an "out of memory" condition, it simply has allocation sometimes failing (throwing std::bad_alloc) for non-constrained reasons. It not failing is a valid behavior of the abstract machine, and not failing by not allocating (actual) memory (on the actual computer) is also valid, so long as the non-existence of the memory has no observable side effects.

A slightly more toy case:

int main() {
  int* x = new int[std::size_t(-1)];
  delete[] x;
}

while this program clearly allocates way too much memory, the compiler is free to not allocate anything.

We can go further. Even:

int main() {
  int* x = new int[std::size_t(-1)];
  x[std::size_t(-2)] = 2;
  std::cout << x[std::size_t(-2)] << '
';
  delete[] x;
}

can be turned into std::cout << 2 << ' ';. That large buffer must exist abstractly, but as long as your "real" program behaves as-if the abstract machine would, it doesn't actually have to allocate it.

Unfortunately, doing so at any reasonable scale is difficult. There are lots and lots of ways information can leak from a C++ program. So relying on such optimizations (even if they happen) is not going to end well.

¹ There was some stuff about coalescing calls to new that might confuse the issue, I am uncertain if it would be legal to skip calls even if there was a replaced ::new.

An important fact is that there are situations that the compiler is not required to behave as-if there was a copy, even if std::move was not called.

When you return a local variable from a function in a line that looks like return X; and X is the identifier, and that local variable is of automatic storage duration (on the stack), the operation is implicitly a move, and the compiler (if it can) can elide the existence of the return value and the local variable into one object (and even omit the move).

The same is true when you construct an object from a temporary -- the operation is implicitly a move (as it is binding to an rvalue) and it can elide away the move completely.

In both these cases, the compiler is required to treat it as a move (not a copy), and it can elide the move.

std::vector foo() {
  std::vector x = {1,2,3,4};
  return x;
}

that x has no std::move, yet it is moved into the return value, and that operation can be elided (x and the return value can be turned into one object).

This:

std::vector foo() {
  std::vector x = {1,2,3,4};
  return std::move(x);
}

blocks elision, as does this:

std::vector foo(std::vector x) {
  return x;
}

and we can even block the move:

std::vector foo() {
  std::vector x = {1,2,3,4};
  return (std::vector const&)x;
}

or even:

std::vector foo() {
  std::vector x = {1,2,3,4};
  return 0,x;
}

as the rules for implicit move are intentionally fragile. (0,x is a use of the much maligned , operator).

Now, relying on implicit-move not occurring in cases like this last , based one is not advised: the standard committee has already changed an implicit-copy case to an implicit-move since implicit-move was added to the language because they deemed it harmless (where the function returns a type A with a A(B&&) ctor, and the return statement is return b; where b is of type B; at C++11 release that did a copy, now it does a move.) Further expansion of implicit-move cannot be ruled out: casting explicitly to a const& is probably the most reliable way to prevent it now and in the future.

Can an optimizing compiler add std::move?

Answers (2)

Related Questions