Reputation: 800

Linq late binding confusion

Can someone please explain what I am missing here? Based on my basic understanding, a LINQ result is calculated only when the result is used, and I can see that in the following code.

 static void Main(string[] args)
 {
     Action<IEnumerable<int>> print = (x) =>
     {
         foreach (int i in x)
         {
             Console.WriteLine(i);
         }
     };

     int[] arr = { 1, 2, 3, 4, 5 };
     int cutoff = 1;
     IEnumerable<int> result = arr.Where(x => x < cutoff);
     Console.WriteLine("First Print");
     cutoff = 3;
     print(result);
     Console.WriteLine("Second Print");
     cutoff = 4;
     print(result);
     Console.Read();
 }

Output:

First Print
1
2
Second Print
1
2
3

Now I changed

arr.Where(x => x < cutoff);

to

IEnumerable<int> result = arr.Take(cutoff);

and the output is as follows.

First Print
1
Second Print
1

Why does Take not use the current value of the variable?

Upvotes: 3

Answers (3)

Jon Hanna

Reputation: 113292

There are a few different things getting confused here.

Late-binding: This is where the meaning of code is determined after it was compiled. For example, x.DoStuff() is early-bound if the compiler checks that objects of x's type have a DoStuff() method (considering extension methods and default arguments too) and then produces the call to it in the code it outputs, or fails with a compiler error otherwise. It is late-bound if the search for the DoStuff() method is done at run-time and throws a run-time exception if there was no DoStuff() method. There are pros and cons to each, and C# is normally early-bound but has support for late-binding (most simply through dynamic but the more convoluted approaches involving reflection also count).

Delayed execution: Strictly speaking, all Linq methods immediately produce a result. However, that result is an object which stores a reference to an enumerable object (often the result of the previous Linq method) which it will process in an appropriate manner when it is itself enumerated. For example, we can write our own Take method as:

private static IEnumerable<T> TakeHelper<T>(IEnumerable<T> source, int number)
{
  foreach(T item in source)
  {
    yield return item;
    if(--number == 0)
      yield break;
  }
}
public static IEnumerable<T> Take<T>(this IEnumerable<T> source, int number)
{
  if(source == null)
    throw new ArgumentNullException();
  if(number < 0)
    throw new ArgumentOutOfRangeException();
  if(number == 0)
    return Enumerable.Empty<T>();
  return TakeHelper(source, number);
}

Now, when we use it:

var taken4 = someEnumerable.Take(4);//taken4 has a value, so we've already done
                                    //something. If it was going to throw
                                    //an argument exception it would have done so
                                    //by now.

var firstTaken = taken4.First();//only now does the object in taken4
                                //do the further processing that iterates
                                //through someEnumerable.
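
To watch the delay happen, here is a minimal sketch using the built-in Enumerable.Take rather than the hand-written one above. CountingSource and Pulled are hypothetical names made up for this illustration; the source simply counts how many items have been requested from it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // Hypothetical counter: how many items the source has handed out so far.
    public static int Pulled;

    // Hypothetical source that records every item requested from it.
    public static IEnumerable<int> CountingSource()
    {
        for (int i = 1; i <= 10; i++)
        {
            Pulled++;
            yield return i;
        }
    }

    public static void Main()
    {
        Pulled = 0;

        // Building the query returns an object but does not touch the source.
        IEnumerable<int> taken4 = CountingSource().Take(4);
        Console.WriteLine(Pulled);          // still 0 here

        // Enumerating (here via First) is what finally pulls from the source.
        int first = taken4.First();
        Console.WriteLine(first);           // 1
        Console.WriteLine(Pulled > 0);      // True
    }
}
```

The counter stays at zero until First() runs, which is the "delayed" part; the Take call itself had already returned its result object by then.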

Captured variables: Normally when we make use of a variable, we make use of its current state:

int i = 2;
string s = "abc";
Console.WriteLine(i);
Console.WriteLine(s);
i = 3;
s = "xyz";

It's pretty intuitive that this prints 2 and abc and not 3 and xyz. In anonymous functions and lambda expressions though, when we make use of a variable we are "capturing" it as a variable, and so we will end up using the value it has when the delegate is invoked:

int i = 2;
string s = "abc";
Action λ = () =>
{
  Console.WriteLine(i);
  Console.WriteLine(s);
};
i = 3;
s = "xyz";
λ();

Creating the λ doesn't use the values of i and s, but creates a set of instructions as to what to do with i and s when λ is invoked. Only when that happens are the values of i and s used, so this prints 3 and xyz.

Putting it all together: In none of your cases do you have any late-binding. That is irrelevant to your question.

In both you have delayed execution. Both the call to Take and the call to Where return enumerable objects which will act upon arr when they are enumerated.

In only one do you have a captured variable. The call to Take passes an integer directly to Take and Take makes use of that value. The call to Where passes a Func<int, bool> created from a lambda expression, and that lambda expression captures an int variable. Where knows nothing of this capture, but the Func does.

That's the reason the two behave so differently in how they treat cutoff.
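
As a rough sketch of that difference side by side (the class name, the Run helper and the variable names are made up for illustration, and the starting values differ from the question so that each result is non-empty), note that copying cutoff into a local that is never reassigned is one way to make Where behave like Take:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CutoffDemo
{
    // Hypothetical helper: builds three queries, changes cutoff, then
    // enumerates each query and returns the results as strings.
    public static string[] Run()
    {
        int[] arr = { 1, 2, 3, 4, 5 };
        int cutoff = 2;

        // The lambda captures the variable cutoff itself.
        IEnumerable<int> viaWhere = arr.Where(x => x < cutoff);

        // Take receives a copy of cutoff's value at this moment (2).
        IEnumerable<int> viaTake = arr.Take(cutoff);

        // This lambda captures frozen, which is never reassigned.
        int frozen = cutoff;
        IEnumerable<int> viaFrozenWhere = arr.Where(x => x < frozen);

        cutoff = 4;

        return new[]
        {
            string.Join(",", viaWhere),        // "1,2,3": sees cutoff == 4
            string.Join(",", viaTake),         // "1,2":   still uses 2
            string.Join(",", viaFrozenWhere),  // "1":     frozen is still 2
        };
    }

    public static void Main()
    {
        foreach (string line in Run())
            Console.WriteLine(line);
    }
}
```

All three queries are enumerated only after cutoff changes, so all three are delayed; only the first one observes the change.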

Upvotes: 4

JaredPar

Reputation: 754833

The behavior you're seeing comes from the different ways in which the arguments to the LINQ functions are evaluated. The Where method receives a lambda which captures the variable cutoff by reference. The lambda is evaluated on demand and hence sees the value of cutoff at that time.

The Take method (and similar methods like Skip) takes an int parameter, and hence cutoff is passed by value. The value used is the value of cutoff at the moment the Take method is called, not when the query is evaluated.
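
One way to picture "captured by reference" is that the compiler moves the captured local into a field of a hidden class, and the lambda becomes a method on that class. The sketch below writes that out by hand; CutoffClosure is a made-up name and only an approximation of what the compiler really generates.

```csharp
using System;

// Hand-written stand-in for the compiler-generated closure class.
// The name and shape are assumptions for illustration only.
public sealed class CutoffClosure
{
    public int cutoff;                              // the captured variable lives here

    public bool Predicate(int x) => x < cutoff;     // body of x => x < cutoff
}

public static class ClosureDemo
{
    public static void Main()
    {
        var closure = new CutoffClosure { cutoff = 1 };
        Func<int, bool> predicate = closure.Predicate;   // what Where receives

        Console.WriteLine(predicate(2));   // False, because 2 < 1 fails
        closure.cutoff = 3;                // roughly what "cutoff = 3" turns into
        Console.WriteLine(predicate(2));   // True, because 2 < 3 holds

        int copy = closure.cutoff;         // what passing an int to Take amounts to
        closure.cutoff = 4;
        Console.WriteLine(copy);           // 3, the copy is unaffected
    }
}
```

The delegate and the surrounding method share one storage location, while the int handed to Take is an independent copy.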

Note: The term late binding here is a bit incorrect. Late binding generally refers to the process where the members an expression binds to are determined at runtime vs. compile time. In C# you'd accomplish this with dynamic or reflection. The behavior of LINQ to evaluate its parts on demand is known as delayed execution.

Upvotes: 6

Femaref

Reputation: 61437

Take doesn't take a lambda but an integer, so its result can't change when you change the original variable.
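
If you did want Take to follow the variable, one option is to hand it a lambda yourself. The following is only a sketch: TakeLazy is a hypothetical extension method, not part of LINQ, that asks for the count each time the sequence is enumerated.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyTakeExtensions
{
    // Hypothetical helper, not a LINQ method: the count comes from a delegate
    // that is invoked on enumeration instead of when the query is built.
    public static IEnumerable<T> TakeLazy<T>(this IEnumerable<T> source, Func<int> count)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (count == null) throw new ArgumentNullException(nameof(count));
        return TakeLazyIterator(source, count);
    }

    private static IEnumerable<T> TakeLazyIterator<T>(IEnumerable<T> source, Func<int> count)
    {
        // This body runs on each enumeration, so count() is read at that time.
        foreach (T item in source.Take(count()))
            yield return item;
    }
}

public static class LazyTakeDemo
{
    public static void Main()
    {
        int[] arr = { 1, 2, 3, 4, 5 };
        int cutoff = 1;
        IEnumerable<int> result = arr.TakeLazy(() => cutoff);

        cutoff = 3;
        Console.WriteLine(string.Join(",", result));   // 1,2,3
        cutoff = 4;
        Console.WriteLine(string.Join(",", result));   // 1,2,3,4
    }
}
```

Here the lambda () => cutoff captures the variable, so each enumeration sees whatever value cutoff holds at that moment.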

Upvotes: 1
