Vincent

Reputation: 60361

Cost of capture by reference/value in lambda function?

Consider the following code:

#include <iostream>
#include <algorithm>
#include <numeric>
#include <vector>   // needed for std::vector

int main()
{
    const unsigned int size = 1000;
    std::vector<int> v(size);
    unsigned int cst = size/2;
    std::iota(v.begin(), v.end(), 0);
    std::random_shuffle(v.begin(), v.end());
    std::cout<<std::find_if(v.begin(), v.end(), [&cst](const int& i){return i == cst;})-v.begin()<<std::endl;
    std::cout<<std::find_if(v.begin(), v.end(), [=](const int& i){return i == cst;})-v.begin()<<std::endl;
    return 0;
}

This code fills a vector with values, shuffles it, and then searches for the index of a specified value (it is just an example to illustrate my problem). This value, cst, can be captured by reference or by value in the lambda function.

My question: is there a difference in performance between the two versions or will they be optimized in the same way by the compiler?

Is it a good rule to pass constant fundamental types by value and constant classes by reference (as with normal functions)?

Upvotes: 6

Views: 4707

Answers (3)

kunysch

Reputation: 507

In practice there is no performance difference for small types.

With clang -O3 I get identical code in both cases. Without optimizations clang generates different code and the copying version happens to be one instruction smaller.

$ clang ref.cpp -O3 -std=c++11 -S -o ref.s
$ clang cpy.cpp -O3 -std=c++11 -S -o cpy.s
$ diff ref.s cpy.s

There is a small const-related difference.

Inside the lambda body, the copy-capture gives you a const unsigned value (unless the lambda is declared mutable). This will not compile:

unsigned cst = 123;
[=](const int& i){ return i == ++cst; }

The reference-capture of a non-const variable results in a non-const unsigned& reference. This modifies the original value as a side-effect:

unsigned cst = 123;
[&](const int& i){ return i == ++cst; }
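To make the two cases above concrete, here is a minimal sketch. The function names are hypothetical and only exist for illustration. It uses `mutable` to make the by-copy version compile, and shows that only the reference capture changes the outer variable:

```cpp
// Hypothetical helper: increments cst through a by-copy capture.
// `mutable` is required, otherwise ++cst would not compile.
unsigned outer_after_copy_capture(unsigned cst)
{
    auto f = [=]() mutable { return ++cst; };  // increments the lambda's private copy
    f();
    return cst;  // the outer cst is untouched
}

// Hypothetical helper: increments cst through a by-reference capture.
unsigned outer_after_ref_capture(unsigned cst)
{
    auto f = [&]() { return ++cst; };  // increments the outer cst itself
    f();
    return cst;  // the outer cst has been modified
}
```

Calling both with 123 should leave the first at 123 and move the second to 124.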

As a rule of thumb, avoid copying large objects. If a small object should be constant inside the lambda but is not constant in the enclosing scope, copy-capture is a good choice. If the lifetime of the lambda exceeds the lifetime of your local object, copy-capture is the only choice, since a captured reference would dangle.
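The lifetime point can be sketched as follows. `make_matcher` and `index_of` are hypothetical names, not something from the question. The lambda is returned from the function, so it outlives the local `target` and has to hold its own copy:

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical factory: the returned lambda outlives `target`,
// so `target` is captured by copy. Capturing it with [&target]
// would leave a dangling reference once this function returns.
std::function<bool(int)> make_matcher(int target)
{
    return [target](int i) { return i == target; };
}

// Returns the index of the first element equal to `target`,
// or v.size() if there is none.
std::ptrdiff_t index_of(const std::vector<int>& v, int target)
{
    return std::find_if(v.begin(), v.end(), make_matcher(target)) - v.begin();
}
```

For a vector holding 4, 8, 15, 16, 23, 42, `index_of(v, 16)` gives 3.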

Upvotes: 7

aaronman

Reputation: 18750

cst is an unsigned int, so it is unlikely to make a difference. However, if you do this with a large class that holds a significant amount of data, it may make a difference: capturing by reference will be faster.

Another thing to consider in this case is that the object is not copied for every element while the vector is being iterated over; the copy happens when the lambda is created. If you take a look at the STL functions, most things are passed by const reference or plain reference, and I don't see why capturing variables should be any different. Unfortunately, though, you can't directly capture a variable as const.

Of course, you always have to be careful when passing by reference, because you can modify the original. I think in this case it may be better to just pass it as a const reference.
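One possible workaround, sketched here with a hypothetical `count_equal` function, is to bind a const reference first and capture that alias by reference. As far as I can tell this works in C++11; C++14 init-captures offer another route.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical example: `needle` is non-const in this scope, but the
// lambda only sees it through a const alias, so it cannot modify it.
std::size_t count_equal(const std::vector<std::string>& words, std::string& needle)
{
    const std::string& needle_view = needle;  // const alias, no copy made
    return static_cast<std::size_t>(
        std::count_if(words.begin(), words.end(),
                      [&needle_view](const std::string& w) {
                          // needle_view += "x";  // would not compile: needle_view is const
                          return w == needle_view;
                      }));
}
```

The string is never copied, and an accidental write inside the lambda becomes a compile error.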

One last thing to consider is that since the compiler may be able to optimize the difference away you should probably just use the form that you feel specifies your intent best. So basically I agree with your assumption that you should

pass constant fundamental types by value and constant classes by reference

Upvotes: 1

Andrew Tomazos

Reputation: 68618

The lambda capture isn't really relevant. The difference is between:

int x = y;

for (...)
    if (x == z)
       ...

and

const int& x = y;

for (...)
    if (x == z)
       ...

That is, storing a reference to const int vs. taking a copy of an int. The first version will never be slower, but I think the optimizer will manage to produce the same code for both. Compile both versions and disassemble them to see what happens.
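If you want to try that comparison yourself, a sketch like the following might be a convenient starting point. The two function names are made up for this example. Put them in one file, compile to assembly with optimizations on (for example with -O3 -S as in the other answer), and compare the two function bodies:

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Version 1: the lambda holds its own copy of cst.
std::ptrdiff_t find_by_copy(const std::vector<int>& v, int cst)
{
    return std::find_if(v.begin(), v.end(),
                        [cst](int i) { return i == cst; }) - v.begin();
}

// Version 2: the lambda holds a reference to the caller's cst.
std::ptrdiff_t find_by_ref(const std::vector<int>& v, const int& cst)
{
    return std::find_if(v.begin(), v.end(),
                        [&cst](int i) { return i == cst; }) - v.begin();
}
```

Both functions return the same index for the same input, so any difference you see in the generated code comes from the capture mode alone.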

Upvotes: 2
