C++ lambda by-value capture semantics and allowed optimizations

Question

What is the compiler allowed to omit from by-value default captures, when only some data members of an implicitly captured object are actually used by the functor? E.g.,

#include <array>
#include <iostream>

struct A {
  // some members we care about:
  char x;
  int y;
  // some huge amount of state we do not:
  std::array<char, 4096> z;  // element type and size are placeholders

  int foo() const { return y + 1; }
};

void bar() {
  A a;
  // must the entirety of a be copy captured, or is the compiler allowed to pick/prune?
  auto l1 = [=](){ std::cout << a.x << ", " << a.y << std::endl; };
  // ...
}

Similarly, when, if ever, is early evaluation allowed to omit broader captures?

void baz(int i) {
  A a2;
  a2.y = i;

  // capture fundamentally only needs 1 int, not all of an A instance.
  auto l2 = [=](){ std::cout << a2.foo() << std::endl; };
}

There are at least some situations where making a partial vs. complete copy capture of an element should have no visible external effects beyond lambda size, but I do not know where in the spec to look for the answer to what optimizations are allowable.

Michael Kenzel · Accepted Answer

I think that, in principle, a compiler would be allowed under the as-if rule to optimize this so that the closure holds only a copy of the members that are actually used. The relevant part of [expr.prim.lambda] §2:

[…] An implementation may define the closure type differently from what is described below provided this does not alter the observable behavior of the program other than by changing:

the size and/or alignment of the closure type,

whether the closure type is trivially copyable, or

whether the closure type is a standard-layout class.

However, in a quick test checking for sizeof() of the closure type, none of the major compilers (clang, gcc, msvc) seemed to optimize the closure type itself in such ways.
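A minimal sketch of how such a sizeof check might look, not the exact test used above. The array size in A, the ClosureSizes struct, and the closure_sizes helper are assumptions made for illustration. For comparison it also builds a closure with C++14 init-captures that names only the two members, which is one way to get a small closure without relying on the optimizer.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

struct A {
    char x;
    int y;
    std::array<char, 4096> z;  // placeholder for the "huge" state; the size is an assumption
};

// Hypothetical result type: sizes of the two closure objects, in bytes.
struct ClosureSizes {
    std::size_t whole_object;  // closure from [=] that only reads a.x and a.y
    std::size_t members_only;  // closure from init-captures of x and y
};

// Hypothetical helper that builds both closures and reports their sizes.
inline ClosureSizes closure_sizes() {
    A a{};  // value-initialized so that reading x and y is well defined

    // Default by-value capture: naming a.x and a.y odr-uses a, so a is the captured entity.
    auto whole = [=] { return a.x + a.y; };

    // Init-captures: the closure holds one char and one int, nothing else.
    auto members = [x = a.x, y = a.y] { return x + y; };

    // Both closures compute the same value; only their layout may differ.
    assert(whole() == members());

    return {sizeof(whole), sizeof(members)};
}
```

Printing closure_sizes().whole_object next to sizeof(A) shows whether a given compiler pruned the capture. The standard text quoted above permits either outcome, so the result is a property of the implementation, not of the language.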

It should be noted, though, that this only really becomes an issue once you actually store the object obtained from a lambda expression somewhere (e.g. in a std::function). More often than not, the result of a lambda expression will simply be used as an argument to some function template and then thrown away. In such a case, where everything ends up being inlined, the optimizer should (and did in my tests) just throw away code generated for copying around data that is never referenced…
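The two situations described above can be sketched as follows. This is an illustration under assumptions, not the code used in the tests: the array size in A and the names call_once, used_and_discarded, and stored are hypothetical.

```cpp
#include <array>
#include <functional>

struct A {
    char x;
    int y;
    std::array<char, 4096> z;  // placeholder for the "huge" state; the size is an assumption
    int foo() const { return y + 1; }
};

// Hypothetical function template: calls the functor once, then discards it.
template <typename F>
int call_once(F f) { return f(); }

// The closure is a temporary passed straight to a template. Once call_once is
// inlined, an optimizer is free to drop the copy of the unused member z.
int used_and_discarded(int i) {
    A a{};
    a.y = i;
    return call_once([=] { return a.foo(); });
}

// The closure is stored in a std::function and outlives the call. The whole
// closure object, including any captured copy of z, has to be kept somewhere.
std::function<int()> stored(int i) {
    A a{};
    a.y = i;
    return [=] { return a.foo(); };
}
```

Both functions return the same values for the same input; the difference is only in how much state has to survive the call. Comparing the generated assembly for used_and_discarded at -O0 and -O2 is one way to see whether the copy of z is actually eliminated.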
