Ricardo C

Reputation: 33

When does the iteration variable in a for loop increment?

I'm currently reading Albahari's O'Reilly book, C# in a Nutshell, and am in the LINQ query chapter. He is describing the effect of delayed execution and variable capturing when building LINQ queries. He gives the following example of a common mistake:

IEnumerable<char> query = "Not what you might expect";
string vowels = "aeiou";
for (int i = 0; i < vowels.Length; i++)
{
    query = query.Where(c => c != vowels[i]);
}
foreach (var c in query)
{
    Console.WriteLine(c);
}
Console.Read();

An IndexOutOfRangeException is thrown once the query is enumerated, but this doesn't make any sense to me. I would expect that the lambda expression in the Where operator, c => c != vowels[i], would simply evaluate as c => c != vowels[4] for the entire sequence, due to the effect of delayed execution and variable capturing. I debugged to see what value i had when the exception was thrown and found that it was 5. So I changed the condition in the for loop to i < vowels.Length - 1, and indeed no exception was thrown. Is the for loop incrementing i to 5 after the very last iteration, or is LINQ doing something else?

Upvotes: 3

Views: 3792

Answers (3)

Ben N

Reputation: 2923

This is your lambda, a function that is declared inside another and can refer to variables from the parent function:

c => c != vowels[i]

The Where function doesn't actually call the lambda function until you try to iterate over the resulting sequence in your foreach loop. Unlike normal instructions where you use the value of the variable (e.g. Console.WriteLine(i);), i inside the lambda refers to the actual variable i. Therefore, once you're done with the first loop, every single lambda you created is referring to the same variable i.

When the lambda is finally evaluated, the loop has already finished, so i equals vowels.Length (5 here), an index outside the bounds of the string. Evaluating vowels[i] then throws the IndexOutOfRangeException you saw.
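
As a minimal sketch of that point, the snippet below shows a lambda observing a change made to a variable after the lambda was created. The names SharedVariableDemo and ReadAfterChange are invented for illustration and do not come from the question.

```csharp
using System;

public static class SharedVariableDemo
{
    // Hypothetical helper: returns what the lambda sees when it finally runs.
    public static int ReadAfterChange()
    {
        int i = 1;
        Func<int> read = () => i;   // captures the variable i, not the value 1
        i = 42;                     // changed after the lambda was created
        return read();              // the lambda reads the current value of i
    }
}
```

Calling SharedVariableDemo.ReadAfterChange() returns 42, not 1, because the lambda and the method share one variable.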

You should change your for loop to this:

for (int i = 0; i < vowels.Length; i++)
{
    int index = i;
    query = query.Where(c => c != vowels[index]);
}

The index variable is recreated on every iteration of the loop, so each lambda you create references a different variable with a different value.
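
An alternative sketch, assuming C# 5 or later: there, the iteration variable of a foreach loop is a fresh variable on each iteration, so looping over the characters directly avoids the extra local. The names CaptureDemo and RemoveVowels are made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CaptureDemo
{
    // Hypothetical helper: filters out every character found in `vowels`.
    public static string RemoveVowels(string text, string vowels = "aeiou")
    {
        IEnumerable<char> query = text;

        // In C# 5 and later, each iteration gets its own `vowel` variable,
        // so each lambda captures a distinct variable.
        foreach (char vowel in vowels)
        {
            query = query.Where(c => c != vowel);
        }

        // Enumerating here is what actually runs the predicates.
        return new string(query.ToArray());
    }
}
```

With the string from the question, CaptureDemo.RemoveVowels("Not what you might expect") gives "Nt wht y mght xpct". Note that this per-iteration behaviour applies to foreach only; the variable of a for loop is still shared across iterations.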

Upvotes: 1

user1845593

Reputation: 1824

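To help you understand this, try debugging the following code and watch the Output window. The "out" lines for 0 through 4 appear first; the first "inner" line appears only after "before query", by which time i is already 5.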

    private void button1_Click(object sender, EventArgs e)
    {
        IEnumerable<char> query = "Not what you might expect";
        string vowels = "aeiou";
        for (int i = 0; i < vowels.Length; i++)
        {
            Console.WriteLine("out: " + i);
            query = query.Where(c =>
            {
                Console.WriteLine("inner: " + i);
                return c != vowels[i];
            });
        }
        Console.WriteLine("before query");
        foreach (var c in query)
        {
            Console.WriteLine(c);
        }
        Console.Read();
    }

Upvotes: 0

Lasse V. Karlsen

Reputation: 391396

For all intents and purposes (other than captured variables that is), this:

for (int i = 0; i < 10; i++)
    ....

can be rewritten as:

int i = 0;
while (i < 10)
{
    ....
    i++;
}

So as you see, the iteration stops only when the condition is false, and for the condition to be false, i has to be equal to or greater than 10.

In fact, if I try this program in LINQPad:

void Main() { }

public static void Test1()
{
    for (int i = 0; i < 10; i++)
        Console.WriteLine(i);
}

public static void Test2()
{
    int i = 0;
    while (i < 10)
    {
        Console.WriteLine(i);
        i++;
    }
}

and then check the generated IL, with the two methods side by side:

Test1:                                            Test2:
IL_0000:  ldc.i4.0                                IL_0000:  ldc.i4.0    
IL_0001:  stloc.0     // i                        IL_0001:  stloc.0     // i
IL_0002:  br.s        IL_000E                     IL_0002:  br.s        IL_000E
IL_0004:  ldloc.0     // i                        IL_0004:  ldloc.0     // i
IL_0005:  call        System.Console.WriteLine    IL_0005:  call        System.Console.WriteLine
IL_000A:  ldloc.0     // i                        IL_000A:  ldloc.0     // i
IL_000B:  ldc.i4.1                                IL_000B:  ldc.i4.1    
IL_000C:  add                                     IL_000C:  add         
IL_000D:  stloc.0     // i                        IL_000D:  stloc.0     // i
IL_000E:  ldloc.0     // i                        IL_000E:  ldloc.0     // i
IL_000F:  ldc.i4.s    0A                          IL_000F:  ldc.i4.s    0A 
IL_0011:  blt.s       IL_0004                     IL_0011:  blt.s       IL_0004
IL_0013:  ret                                     IL_0013:  ret         

Then you can see that it generated the exact same code.

Now, the compiler ensures you cannot write code after the for loop that accesses the variable. But if you capture the variable, as your code does, the delegate sees the variable as it was when the loop ended, and the loop only ends by itself when the condition is false.

As such, your assumption that i would equal the index of the last character in the string is false. It will equal the index just past it, and thus you get an IndexOutOfRangeException when you execute the delegate.

Here's a simple .NET Fiddle that demonstrates this. The following program:

using System;

public class Program
{
    public static void Main()
    {
        Action a = null;
        for (int index = 0; index < 10; index++)
            a = () => Console.WriteLine(index);

        a();
    }
}

outputs 10.
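
To make the capture more concrete, here is a rough, hand-written approximation of what happens to a captured loop variable: it is hoisted into a field on a hidden class, and both the loop and the delegate work on that one field. The names Closure, LoweredDemo and Run are invented for this sketch; the code the compiler really generates uses different names and differs in detail.

```csharp
using System;

// Hand-written stand-in for the hidden class the compiler generates.
public sealed class Closure
{
    public int index;                  // the captured loop variable lives here

    public int Read() => index;        // plays the role of the lambda body
}

public static class LoweredDemo
{
    public static int Run()
    {
        var closure = new Closure();   // one instance for the whole loop
        Func<int> a = closure.Read;

        // The loop increments the field, not a separate local.
        for (closure.index = 0; closure.index < 10; closure.index++)
            a = closure.Read;          // every delegate shares `closure`

        // The loop only exits once index < 10 is false, i.e. index == 10.
        return a();
    }
}
```

LoweredDemo.Run() returns 10, matching the fiddle: the delegate reads the field after the final increment has already happened.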

Upvotes: 7
