Willy

Reputation: 581

Behavior difference of lambda function mutable capture from a reference to global variable

I found that the results differ across compilers when I use a mutable lambda to capture a reference to a global variable and then modify the value inside the lambda.

#include <stdio.h>
#include <functional>

int n = 100;

std::function<int()> f()
{
    int &m = n;
    return [m] () mutable -> int {
        m += 123;
        return m;
    };
}

int main()
{
    int x = n;
    int y = f()();
    int z = n;

    printf("%d %d %d\n", x, y, z);
    return 0;
}

Result from VS 2015 and GCC (g++ (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609):

100 223 100

Result from clang++ (clang version 3.8.0-2ubuntu4 (tags/RELEASE_380/final)):

100 223 223

Why does this happen? Is this allowed by the C++ Standards?

Upvotes: 22

Views: 1248

Answers (3)

Remy Lebeau

Reputation: 595971

A lambda can't capture a reference itself by value (use std::reference_wrapper for that purpose).
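A minimal sketch of the std::reference_wrapper approach, in case it helps. The names make_adder and adder_modifies_global are made up for illustration and are not from the question; the idea is only that the closure copies the wrapper, and the copied wrapper still refers to ::n.

```cpp
#include <cassert>
#include <functional>

int n = 100;

// Hypothetical helper (not from the question): the closure stores a copy of
// the reference_wrapper, and that copy still refers to ::n.
std::function<int()> make_adder()
{
    auto r = std::ref(n);          // std::reference_wrapper<int> bound to ::n
    return [r]() -> int {
        r.get() += 123;            // get() yields int&, so this writes to ::n
        return r.get();
    };
}

// Usage sketch: returns true when the call changed ::n itself.
bool adder_modifies_global()
{
    int before = n;
    int result = make_adder()();
    assert(result == before + 123);
    return n == result;
}
```

Note that the lambda does not need mutable here: get() is a const member function, so the wrapper itself is never modified.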

In your lambda, [m] captures m by value (because there is no & in the capture), so the lambda captures a copy of the object that m refers to (n), not the reference itself. This is no different from doing this:

int &m = n;
int x = m; // <-- copy made!

The lambda then modifies that copy, not the original. That is what you are seeing happen in the VS and GCC outputs, as expected.

The Clang output is wrong, and should be reported as a bug, if it hasn't already.

If you want your lambda to modify n, capture m by reference instead: [&m]. This is no different from binding one reference to another, e.g.:

int &m = n;
int &x = m; // <-- no copy made!
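To see the two capture modes side by side, here is a small sketch. It deliberately uses a local variable rather than the global n, and the names capture_modes_demo, target and alias are illustrative assumptions, not from the question.

```cpp
#include <cassert>

// Hypothetical demo function; returns the final value of target.
int capture_modes_demo()
{
    int target = 100;
    int &alias = target;

    // [alias]: the closure holds its own int, copied from target.
    auto by_copy = [alias]() mutable { alias += 1; return alias; };
    int a = by_copy();
    assert(a == 101 && target == 100);   // original untouched

    // [&alias]: the closure refers to target itself.
    auto by_ref = [&alias]() { alias += 10; return alias; };
    int b = by_ref();
    assert(b == 110 && target == 110);   // original modified

    return target;
}
```

Both lambdas are called while target is still alive, so there is no lifetime concern in this sketch.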

Getting rid of m and writing [&n] is not an option, though: the standard only allows variables with automatic storage duration to be named in a capture, and n is a global. It doesn't need to be captured anyway, since the lambda can access it directly:

return [] () -> int {
    n += 123;
    return n;
};

Upvotes: 15

aschepler

Reputation: 72281

This is not allowed by the C++17 Standard, but by some other Standard drafts it might be. It's complicated, for reasons not explained in this answer.

[expr.prim.lambda.capture]/10:

For each entity captured by copy, an unnamed non-static data member is declared in the closure type. The declaration order of these members is unspecified. The type of such a data member is the referenced type if the entity is a reference to an object, an lvalue reference to the referenced function type if the entity is a reference to a function, or the type of the corresponding captured entity otherwise.

The [m] means that the variable m in f is captured by copy. The entity m is a reference to object, so the closure type has a member whose type is the referenced type. That is, the member's type is int, and not int&.

Since the name m inside the lambda body names the closure object's member and not the variable in f (and this is the questionable part), the statement m += 123; modifies that member, which is a different int object from ::n.

Upvotes: 4

walnut

Reputation: 22152

I think Clang may actually be correct.

According to [lambda.capture]/11, an id-expression used in the lambda refers to the lambda's by-copy-captured member only if it constitutes an odr-use. If it doesn't, then it refers to the original entity. This applies to all C++ versions since C++11.

According to C++17's [basic.def.odr]/3, a reference variable is not odr-used if applying the lvalue-to-rvalue conversion to it yields a constant expression.

In the C++20 draft, however, the requirement for the lvalue-to-rvalue conversion is dropped, and the relevant passage has changed multiple times to include or exclude the conversion. See CWG issue 1472 and CWG issue 1741, as well as open CWG issue 2083.

Since m is initialized with a constant expression (referring to a static storage duration object), using it yields a constant expression per the exception in [expr.const]/2.11.1.
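A rough sketch of what that means in practice, assuming a compiler that follows the wording cited above. The function name address_via_reference is made up for illustration.

```cpp
#include <cassert>

int n = 100;

// Hypothetical helper: returns the address obtained through the reference.
int *address_via_reference()
{
    int &m = n;   // binds to an object with static storage duration

    // Naming m here is allowed in a constant expression, because m was
    // initialized with a constant expression; no value of n is read.
    constexpr int *p = &m;

    // By contrast, something like
    //     constexpr int v = m;
    // is expected to be rejected, since it needs the (non-constant) value of n.

    assert(p == &n);
    return p;
}
```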

This is not the case however if lvalue-to-rvalue conversions are applied, because the value of n is not usable in a constant expression.

Therefore, depending on whether or not lvalue-to-rvalue conversions are supposed to be applied in determining odr-use, when you use m in the lambda, it may or may not refer to the member of the lambda.

If the conversion should be applied, GCC and MSVC are correct, otherwise Clang is.
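If the goal is simply to get a private copy regardless of how this question is resolved, one possible workaround is a C++14 init-capture with a distinct name. This is a sketch, not something from the question: the name copy and the helper global_unchanged_after_call are made up, and it assumes C++14 or later. As far as I can tell, an init-capture declares a separate variable, so the question of which m is being named does not arise.

```cpp
#include <cassert>
#include <functional>

int n = 100;

std::function<int()> f()
{
    int &m = n;
    // Init-capture (C++14): the closure gets its own int named copy,
    // initialized from the current value of n through m.
    return [copy = m]() mutable -> int {
        copy += 123;
        return copy;
    };
}

// Usage sketch (hypothetical helper): true when ::n is left alone.
bool global_unchanged_after_call()
{
    int before = n;
    int result = f()();
    assert(result == before + 123);
    return n == before;
}
```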

You can see that Clang changes its behavior if you change the initialization of m so that it is no longer a constant expression:

#include <stdio.h>
#include <functional>

int n = 100;

void g() {}

std::function<int()> f()
{
    int &m = (g(), n);
    return [m] () mutable -> int {
        m += 123;
        return m;
    };
}

int main()
{
    int x = n;
    int y = f()();
    int z = n;

    printf("%d %d %d\n", x, y, z);
    return 0;
}

In this case all compilers agree that the output is

100 223 100

because m in the lambda will refer to the closure's member which is of type int copy-initialized from the reference variable m in f.

Upvotes: 5
