Dinesh Maurya
Dinesh Maurya

Reputation: 852

Capturing by value in recursive lambda

We can define recursive lambda function like

std::function<void(int)> fun = [&fun](int a) {  if (a) fun(a - 1); };

then we can call it with

fun(10);

However if I change the definition to

std::function<void(int)> fun = [fun](int a) {  if (a) fun(a - 1); };

and then try it to call with

fun(10);

segmentation fault occurs.

Can someone explain about why capture by reference works while capture by value gives segmentation fault.

Upvotes: 5

Views: 489

Answers (2)

Yakk - Adam Nevraumont
Yakk - Adam Nevraumont

Reputation: 275760

Using std::function for a recursive lambda isn't a good plan. In your case, you get an uninitialized copy of the function prior to it having the lambda within it.

Which seems bad. Undefined behavior crashes when you are lucky.

Recursive Lambdas

Let's say we wish to write Euclid's gcd() as a lambda. As a function, it is:

int gcd(int a, int b) {
    return b == 0 ? a : gcd(b, a%b);
}

But a lambda cannot be recursive, it has no way to invoke itself. A lambda has no name and using this within the body of a lambda refers to a captured this (assuming the lambda is created in the body of a member function, otherwise it is an error). So how do we solve this problem?

Use std::function

We can have a lambda capture a reference to a not-yet constructed std::function:

std::function<int(int, int)> gcd = [&](int a, int b){
    return b == 0 ? a : gcd(b, a%b);
};

This works, but should be used sparingly. It's slow (we're using type erasure now instead of a direct function call), it's fragile (copying gcd or returning gcd will break since the lambda refers to the original object), and it won't work with generic lambdas.

Using two smart pointers:

auto gcd_self = std::make_shared<std::unique_ptr< std::function<int(int, int)> >>();
*gcd_self = std::make_unique<std::function<int(int, int)>>(
  [gcd_self](int a, int b){
    return b == 0 ? a : (**gcd_self)(b, a%b);
  };
};

This adds a lot of indirection (which is overhead), but it can be copied/returned, and all copies share state. It does let you return the lambda, and is otherwise less fragile than the above solution.

Use a Y-combinator

With the help of a short utility struct, we can solve all of these problems:

template <class F>
struct y_combinator {
    F f; // the lambda will be stored here

    // a forwarding operator():
    template <class... Args>
    decltype(auto) operator()(Args&&... args) const {
        // we pass ourselves to f, then the arguments.
        // the lambda should take the first argument as `auto&& recurse` or similar.
        return f(*this, std::forward<Args>(args)...);
    }
};
// helper function that deduces the type of the lambda:
template <class F>
y_combinator<std::decay_t<F>> make_y_combinator(F&& f) {
    return {std::forward<F>(f)};
}
// (Be aware that in C++17 we can do better than a `make_` function)

we can implement our gcd as:

auto gcd = make_y_combinator(
  [](auto&& gcd, int a, int b){
    return b == 0 ? a : gcd(b, a%b);
  }
);

The y_combinator is a concept from the lambda calculus that lets you have recursion without being able to name yourself until you are defined. This is exactly the problem lambdas have.

You create a lambda that takes "recurse" as its first argument. When you want to recurse, you pass the arguments to recurse.

The y_combinator then returns a function object that calls that function with its arguments, but with a suitable "recurse" object (namely the y_combinator itself) as its first argument. It forwards the rest of the arguments you call the y_combinator with to the lambda as well.

In short:

auto foo = make_y_combinator( [&](auto&& recurse, some arguments) {
  // write body that processes some arguments
  // when you want to recurse, call recurse(some other arguments)
});

and you have recursion in a lambda with no serious restrictions or significant overhead.

Part of this answer (The Recursive Lambda) was originally written by @Barry on the defunct Stack Overflow Documentation.

Upvotes: 1

Capture by value is evaluated as part of evaluating the lambda expression. At that time, fun is still uninitialised, because you're still evaluating its initialiser. Only after that is fun initialised, but by that time the copy has already happened.

The net effect is that the lambda function object stored inside fun has a data member named fun which is a copy of an uninitalised std::function — Undefined Behaviour.

Upvotes: 8

Related Questions