Xarbrough

Reputation: 1461

Why does this lambda closure generate garbage although it is not executed at runtime?

I've noticed that the following code generates heap allocations, which eventually trigger the garbage collector. I would like to know why this happens and how to avoid it:

private Dictionary<Type, Action> actionTable = new Dictionary<Type, Action>();

private void Update(int num)
{
    Action action = null; // initialized so the code compiles while the lookup is commented out
    // if (!actionTable.TryGetValue(typeof(int), out action))
    if (false)
    {
        action = () => Debug.Log(num);
        actionTable.Add(typeof(int), action);
    }
    action?.Invoke();
}

I understand that a lambda such as () => Debug.Log(num) makes the compiler generate a small helper class (e.g. <>c__DisplayClass7_0) to hold the captured local variable. That is why I wanted to test whether I could cache this allocation in a dictionary. However, I noticed that calling Update allocates even when the lambda code is never reached because of the if-statement. When I comment out the lambda, the allocation disappears from the profiler. I am using the Unity Profiler (a performance reporting tool within the Unity game engine), which shows such allocations in bytes per frame while in development/debug mode.

I surmise that the compiler or JIT compiler generates the helper class for the lambda for the scope of the method even though I don't understand why this would be desirable.

Finally, is there any way of caching delegates in this manner without allocating and without forcing the calling code to cache the action in advance? (I know that I could allocate the action once in the client code, but in this example I would strictly like to implement some kind of automatic caching because I do not have complete control over the client.)

Disclaimer: This is mostly a theoretical question out of interest. I do realize that most applications will not benefit from micro-optimizations like this.

Upvotes: 1

Views: 845

Answers (2)

Servy

Reputation: 203827

"I surmise that the compiler or JIT compiler generates the helper class for the lambda for the scope of the method even though I don't understand why this would be desirable."

Consider the case where there's more than one anonymous method with a closure in the same method (a common enough occurrence). Do you want to create a new instance for every single one, or just have them all share a single instance? They went with the latter. There are advantages and disadvantages to either approach.

"Finally, is there any way of caching delegates in this manner without allocating and without forcing the calling code to cache the action in advance?"

Simply move that anonymous method into its own method, so that when that method is called the anonymous method is created unconditionally.

private void Update(int num)
{
    Action action = null;
    //  if (!actionTable.TryGetValue(typeof(int), out action))
    if (false)
    {
        Action CreateAction()
        {
            return () => Debug.Log(num);
        }
        action = CreateAction();
        actionTable.Add(typeof(int), action);
    }
    action?.Invoke();
}

(I didn't check if the allocation happened for a nested method. If it does, make it a non-nested method and pass in the int.)
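
The non-nested variant mentioned in the note above might look like the following. This is a minimal sketch that has not been profiled in Unity: the class name ActionCache and the helper CreateAction are hypothetical, the TryGetValue check from the question is restored in place of if (false), and Console.WriteLine stands in for Debug.Log so the snippet compiles outside Unity.

```csharp
using System;
using System.Collections.Generic;

public class ActionCache
{
    private readonly Dictionary<Type, Action> actionTable = new Dictionary<Type, Action>();

    public void Update(int num)
    {
        Action action;
        if (!actionTable.TryGetValue(typeof(int), out action))
        {
            // Slow path: the capture happens inside CreateAction, so Update
            // itself contains no lambda and needs no closure object of its own.
            action = CreateAction(num);
            actionTable.Add(typeof(int), action);
        }
        action?.Invoke();
    }

    // Hypothetical helper: 'num' is a parameter of this method, so the
    // closure object belongs to CreateAction's scope rather than Update's.
    private static Action CreateAction(int num)
    {
        // Console.WriteLine is an assumption standing in for Unity's Debug.Log.
        return () => Console.WriteLine(num);
    }
}
```

Note that the cached delegate keeps the value of num from the first call; later calls to Update reuse that delegate and therefore log the first value, which is inherent to caching a delegate that captures an argument.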

Upvotes: 1

Eric Lippert

Reputation: 660169

Servy's answer is correct and gives a good workaround. I thought I might add a few more details.

First off: implementation choices of the C# compiler are subject to change at any time and for any reason; nothing I say here is a requirement of the language and you should not depend on it.

If you have a closed-over outer variable of a lambda then all closed-over variables are made into fields of a closure class, and that closure class is allocated from the long-term pool ("the heap") as soon as the function is activated. This happens regardless of whether the closure class is ever read from.
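
To make that concrete, here is a rough hand-written approximation of what the question's Update method might be lowered to. It is only a sketch: the names LoweredExample, DisplayClass and Lambda are hypothetical, the real generated code differs between compiler versions, and Console.WriteLine stands in for Debug.Log.

```csharp
using System;
using System.Collections.Generic;

public class LoweredExample
{
    // Hypothetical stand-in for the compiler-generated closure class
    // (something like <>c__DisplayClass7_0 in the question).
    private sealed class DisplayClass
    {
        public int num;

        // The body of the lambda becomes a method on the closure class.
        public void Lambda() => Console.WriteLine(num);
    }

    private readonly Dictionary<Type, Action> actionTable = new Dictionary<Type, Action>();

    public void Update(int num)
    {
        // The closure object is allocated on entry, before any branch is
        // evaluated, and the captured parameter is copied into its field.
        var closure = new DisplayClass();
        closure.num = num;

        Action action;
        if (!actionTable.TryGetValue(typeof(int), out action))
        {
            // Only the delegate allocation is conditional.
            action = new Action(closure.Lambda);
            actionTable.Add(typeof(int), action);
        }
        action?.Invoke();
    }
}
```

In this form it is easy to see why the profiler reports an allocation on every call: the new DisplayClass() line runs whether or not the branch that creates the delegate is taken.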

The compiler team could have chosen to defer creation of the closure class until the first point where it was used: where a local was read or written or a delegate was created. However, that would then add additional complexity to the method! That makes the method larger, it makes it slower, it makes it more likely that you'll have a cache miss, it makes the jitter work harder, it makes more basic blocks so the jitter might skip an optimization, and so on. This optimization likely does not pay for itself.

However, the compiler team does make similar optimizations in cases where it is more likely to pay off. Two examples:

  • The 99.99% likely scenario for an iterator block (a method with a yield return in it) is that the IEnumerable will have GetEnumerator called exactly once. The generated enumerable therefore has logic that implements both IEnumerable and IEnumerator; the first time GetEnumerator is called, the object is cast to IEnumerator and returned. The second time, we allocate a second enumerator. This saves one object in the highly likely scenario, and the extra code generated is pretty simple and rarely called.
  • It is common for async methods to have a "fast path" that returns without ever awaiting -- for example, you might have an expensive asynchronous call the first time, and then the result is cached and returned the second time. The C# compiler generates code that avoids creating the "state machine" closure until the first await is encountered, and therefore prevents an allocation on the fast path, if there is one.
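
The async "fast path" in the second bullet could look like the sketch below. CachedLoader, LoadAsync and the Task.Delay call are hypothetical stand-ins for a real expensive operation. The sketch only shows the shape of such a method; whether the state machine allocation is actually avoided depends on the compiler and build configuration, and the returned Task<string> is a separate object that may still be allocated.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public class CachedLoader
{
    private readonly Dictionary<string, string> cache = new Dictionary<string, string>();

    public async Task<string> LoadAsync(string key)
    {
        // Fast path: the result is already cached, so the method returns
        // without ever reaching an await.
        if (cache.TryGetValue(key, out var cached))
        {
            return cached;
        }

        // Slow path: the first await is where the method may suspend and
        // needs its state preserved. Task.Delay stands in for expensive work.
        await Task.Delay(10);
        var value = key.ToUpperInvariant();
        cache[key] = value;
        return value;
    }
}
```

The first call for a given key goes through the await; later calls for the same key complete synchronously.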

These optimizations tend to pay off, but 99% of the time when you have a method that makes a closure, it actually makes the closure. It's not really worth deferring it.

Upvotes: 2
