Konstantin Dedov

Reputation: 425

Nested 'froms' in LINQ

I'm new to LINQ and I have a problem with nested froms:

using System;
using System.Linq;
class MultipleFroms
{
    static void Main()
    {
        char[] chrs = { 'A', 'B', 'C'};
        char[] chrs2 = { 'X', 'Y', 'Z' };
        var pairs = from ch1 in chrs
                    from ch2 in chrs2
                    select ch1+" "+ ch2;
        Console.WriteLine("For ABC and XYZ: ");
        foreach (var p in pairs)
            Console.WriteLine(p);
        Console.WriteLine();

        Console.WriteLine("For D and W: ");
        chrs = new char[] { 'D' };
        chrs2 = new char[] { 'W' };
        foreach (var p in pairs)
            Console.WriteLine(p);
    }
}

The output is:

For ABC and XYZ:
A X
A Y
A Z
B X
B Y
B Z
C X
C Y
C Z

For D and W:
A W
B W
C W

But I expected:

...
For D and W:
D W

Why does pairs in the second case use the "old" chrs, { 'A', 'B', 'C' }, instead of { 'D' }?

Upvotes: 10

Views: 591

Answers (3)

user3079266


This question got several good answers which state the obvious: you need to reassign your pairs variable. However, I got more interested in the strange behaviour itself, namely why reassigning chrs2 affects the result of the enumeration while reassigning chrs does not.

With nested from clauses, it looks like reassigning any of the collections used, except for the FIRST one, affects the result of the enumeration: http://ideone.com/X7f3eQ.

Now, as you probably know, the LINQ "query syntax" is just syntactic sugar for chained calls to the extension methods in the System.Linq namespace. Let's desugar your specific example:

var pairs = from ch1 in chrs
            from ch2 in chrs2
            select ch1 + " "+ ch2;

becomes

var pairs = chrs.SelectMany(ch1 => chrs2, (ch1, ch2) => ch1 + " " + ch2);

(or, with non-extension-method syntax, Enumerable.SelectMany(chrs, ch1 => chrs2, (ch1, ch2) => ch1 + " " + ch2))

(check it here: http://ideone.com/NjVeLD)

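So, what's going on? SelectMany takes chrs and two lambdas as parameters and builds an IEnumerable out of them, which can later be enumerated to start the actual evaluation.

Now, whenever we reassign chrs2, the lambda sees the new value, because chrs2 is a captured variable and is read each time the lambda runs during enumeration. This doesn't work with chrs, however: chrs is evaluated once, when SelectMany is called, so the query keeps a reference to the original array, and reassigning the variable afterwards has no effect on it.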
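
A minimal sketch of that difference, assuming nothing beyond System.Linq. The names CaptureDemo and Run are made up for illustration and are not part of the question's code. The query is built once, both variables are then reassigned, and only the reassignment of the captured variable shows up in the result:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class, not from the original question.
public static class CaptureDemo
{
    // Builds the query once, reassigns both variables, then enumerates it.
    public static List<string> Run()
    {
        char[] chrs = { 'A', 'B', 'C' };
        char[] chrs2 = { 'X', 'Y', 'Z' };

        // chrs is evaluated here: the query stores a reference to the ABC array.
        // chrs2 only appears inside the lambda, so it is captured as a variable.
        var pairs = chrs.SelectMany(ch1 => chrs2, (ch1, ch2) => ch1 + " " + ch2);

        chrs = new[] { 'D' };   // no effect: the query already holds the old array
        chrs2 = new[] { 'W' };  // visible: the lambda reads the variable when it runs

        return pairs.ToList();  // "A W", "B W", "C W"
    }

    public static void Main()
    {
        foreach (var p in Run())
            Console.WriteLine(p);
    }
}
```

The result matches the "For D and W" output in the question: the first source is still the ABC array, while the second source is whatever chrs2 refers to at enumeration time.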

Upvotes: 13

Andrew Savinykh

Reputation: 26290

The easiest way I can think of to explain this is to note that this

var pairs = from ch1 in chrs
    from ch2 in chrs2
    select ch1 + " " + ch2;

is equivalent to:

var pairs = chrs.SelectMany(ch1 => chrs2, (ch1, ch2) => ch1 + " " + ch2);

and that the compiler internally creates a closure class similar to this one:

private sealed class Closure
{
    public char[] chrs2;
    internal IEnumerable<char> Method(char ch1)
    {
        return chrs2;
    }
}

And then modifies your method to read:

static void Main()
{
    Closure closure = new Closure();
    char[] chrs = { 'A', 'B', 'C' };
    closure.chrs2 = new[] { 'X', 'Y', 'Z' };
    var pairs = chrs.SelectMany(ch1 => closure.chrs2, (ch1, ch2) => ch1 + " " + ch2);
    Console.WriteLine("For ABC and XYZ: ");
    foreach (var p in pairs)
        Console.WriteLine(p);
    Console.WriteLine();

    Console.WriteLine("For D and W: ");
    chrs = new[] { 'D' };
    closure.chrs2 = new[] { 'W' };
    foreach (var p in pairs)
        Console.WriteLine(p);
}

I hope that this makes it easy to see how you arrive at your result. Note: I made some simplifications in the explanation above to make the point stand out better.

The next question might be "why is the compiler doing this?". The answer is that lambda functions can be passed around and executed in a different context from the one they were created in. When that happens, it is often desirable to preserve state:

public Action<string> PrintCounter()
{
    int counter = 0;
    return prefix => 
        Console.WriteLine(prefix + " " + (counter++).ToString());
}

With the example above you can pass the function around as much as you like, yet the counter is incremented each time you call it. Normally, local variables such as counter live on the stack, so their lifetime is that of the function call; the stack is unwound when the function finishes executing. To get around this, closures are created, as demonstrated above. Most of the time they are extremely useful, because they allow writing code that separates logic and control structures from the details of how they are going to be used. But in some degenerate cases you see results like the one you've experienced.
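
Here is a small sketch of how such a counter behaves when called. It is a variant of the snippet above that returns the string instead of printing it, so the values can be inspected; CounterDemo and MakeCounter are hypothetical names chosen for this illustration:

```csharp
using System;

// Hypothetical demo class; a variant of PrintCounter that returns its text.
public static class CounterDemo
{
    // Returns a delegate that keeps its own counter alive after this method returns.
    public static Func<string, string> MakeCounter()
    {
        int counter = 0; // captured by the lambda, so it outlives this call
        return prefix => prefix + " " + (counter++).ToString();
    }

    public static void Main()
    {
        var first = MakeCounter();
        var second = MakeCounter(); // separate call, separate captured counter

        Console.WriteLine(first("a"));  // a 0
        Console.WriteLine(first("a"));  // a 1
        Console.WriteLine(second("b")); // b 0
    }
}
```

Each call to MakeCounter produces a delegate with its own captured counter, which is why the second delegate starts again from 0.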

Upvotes: 1

ocuenca

Reputation: 39326

You have to look at the query as a method call where the method receives the first source of data (chrs) as a parameter. The problem is that you can't replace the object on which the method has already been called once the query has been set up. The second source of data (chrs2) is like a global variable; that's why, when you update its value, the result of the query also changes.

A better approach is to move your query to a method:

public static IEnumerable<string> Pairs(char[] chrs, char[] chrs2)
{
    return from ch1 in chrs
           from ch2 in chrs2
           select ch1 + " " + ch2;
}

That way you can do something like this:

static void Main(string[] args)
{
    char[] chrs = { 'A', 'B', 'C' };
    char[] chrs2 = { 'X', 'Y', 'Z' };

    Console.WriteLine("For ABC and XYZ: ");
    foreach (var p in Pairs(chrs, chrs2))
        Console.WriteLine(p);
    Console.WriteLine();

    Console.WriteLine("For D and W: ");
    chrs = new char[] { 'D' };
    chrs2 = new char[] { 'W' };
    foreach (var p in Pairs(chrs, chrs2))
        Console.WriteLine(p);
}
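
If a separate method feels like too much, a similar effect can be sketched with a local delegate that rebuilds the query on every call. This is an alternative I am assuming fits the same goal, not something taken from the question; DeferredSourceDemo and Run are hypothetical names. Because the whole query now sits inside a lambda, both chrs and chrs2 are captured variables:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class, not from the original question.
public static class DeferredSourceDemo
{
    public static List<string> Run()
    {
        char[] chrs = { 'A', 'B', 'C' };
        char[] chrs2 = { 'X', 'Y', 'Z' };

        // The query is rebuilt on each call, reading both variables at that time.
        Func<IEnumerable<string>> pairs = () =>
            from ch1 in chrs
            from ch2 in chrs2
            select ch1 + " " + ch2;

        var results = new List<string>(pairs()); // nine pairs, "A X" through "C Z"

        chrs = new[] { 'D' };
        chrs2 = new[] { 'W' };
        results.AddRange(pairs());               // one more pair: "D W"

        return results;
    }

    public static void Main()
    {
        foreach (var p in Run())
            Console.WriteLine(p);
    }
}
```

The explicit Pairs method above is arguably clearer, since its inputs are visible in the signature; the delegate version only saves a declaration.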

Upvotes: 0
