Reputation: 2914
I was reading this article about LINQ and can't understand how the query is executed in terms of lazy evaluation.
So, I simplified the example from the article to this code:
void Main()
{
    var data =
        from f in GetFirstSequence().LogQuery("GetFirstSequence")
        from s in GetSecondSequence().LogQuery("GetSecondSequence", f)
        select $"{f} {s}";
    data.Dump(); // I use LINQPAD to output the data
}

static IEnumerable<string> GetFirstSequence()
{
    yield return "a";
    yield return "b";
    yield return "c";
}

static IEnumerable<string> GetSecondSequence()
{
    yield return "1";
    yield return "2";
}

public static class Extensions
{
    private const string path = @"C:\dist\debug.log";

    public static IEnumerable<string> LogQuery(this IEnumerable<string> sequence, string tag, string element = null)
    {
        using (var writer = File.AppendText(path))
        {
            writer.WriteLine($"Executing query {tag} {element}");
        }
        return sequence;
    }
}
After running this code, debug.log contains output that I can explain logically:
Executing query GetFirstSequence
Executing query GetSecondSequence a
Executing query GetSecondSequence b
Executing query GetSecondSequence c
Things got strange when I tried to interleave the first three elements with the last three, like this:
void Main()
{
    var data =
        from f in GetFirstSequence().LogQuery("GetFirstSequence")
        from s in GetSecondSequence().LogQuery("GetSecondSequence", f)
        select $"{f} {s}";
    var shuffle = data;
    shuffle = shuffle.Take(3).LogQuery("Take")
        .Interleave(shuffle.Skip(3).LogQuery("Skip")).LogQuery("Interleave");
    shuffle.Dump();
}
Of course, I also need an extension method that interleaves two sequences (taken from the article mentioned above):
public static IEnumerable<string> Interleave(this IEnumerable<string> first, IEnumerable<string> second)
{
    var firstIter = first.GetEnumerator();
    var secondIter = second.GetEnumerator();
    while (firstIter.MoveNext() && secondIter.MoveNext())
    {
        yield return firstIter.Current;
        yield return secondIter.Current;
    }
}
After executing this code, I get the following output in my log file:
Executing query GetFirstSequence
Executing query Take
Executing query Skip
Executing query Interleave
Executing query GetSecondSequence a
Executing query GetSecondSequence a
Executing query GetSecondSequence b
Executing query GetSecondSequence c
Executing query GetSecondSequence b
This confuses me, because I don't understand the order in which my query is executed.
Why is the query executed this way?
Upvotes: 3
Views: 5913
Reputation: 91
var data =
    from f in GetFirstSequence().LogQuery("GetFirstSequence")
    from s in GetSecondSequence().LogQuery("GetSecondSequence", f)
    select $"{f} {s}";
is just another way of writing
var data = GetFirstSequence()
    .LogQuery("GetFirstSequence")
    .SelectMany(f => GetSecondSequence().LogQuery("GetSecondSequence", f), (f, s) => $"{f} {s}");
Let's step through the code:
var data = GetFirstSequence() // returns an IEnumerable<string> without evaluating it
    .LogQuery("GetFirstSequence") // writes "GetFirstSequence" and returns the IEnumerable<string> from its this-parameter without evaluating it
    .SelectMany(f => GetSecondSequence().LogQuery("GetSecondSequence", f), (f, s) => $"{f} {s}"); // returns an IEnumerable<string> without evaluating it
var shuffle = data;
shuffle = shuffle
    .Take(3) // returns an IEnumerable<string> without evaluating it
    .LogQuery("Take") // writes "Take" and returns the IEnumerable<string> from its this-parameter without evaluating it
    .Interleave(
        shuffle
            .Skip(3) // returns an IEnumerable<string> without evaluating it
            .LogQuery("Skip") // writes "Skip" and returns the IEnumerable<string> from its this-parameter without evaluating it
    ) // returns an IEnumerable<string> without evaluating it
    .LogQuery("Interleave"); // writes "Interleave" and returns the IEnumerable<string> from its this-parameter without evaluating it
The code so far is responsible for the first four lines of output:
Executing query GetFirstSequence
Executing query Take
Executing query Skip
Executing query Interleave
None of the IEnumerable<string> have been evaluated yet.
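To make the split between "building" and "evaluating" concrete, here is a minimal sketch that is not taken from the question. The names DeferredDemo, Letters, Tag, and the in-memory Log list are all hypothetical stand-ins (Tag plays the role of LogQuery, Log the role of debug.log). It shows that an eager helper logs as soon as the query is built, while the body of an iterator method only starts running once something enumerates it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class DeferredDemo
{
    // Hypothetical in-memory log that stands in for debug.log.
    public static readonly List<string> Log = new List<string>();

    // Iterator method: calling it only creates the enumerable;
    // the body starts running on the first MoveNext().
    static IEnumerable<string> Letters()
    {
        Log.Add("Letters started");
        yield return "a";
        yield return "b";
    }

    // Same shape as LogQuery: logs eagerly and returns the sequence untouched.
    static IEnumerable<string> Tag(this IEnumerable<string> sequence, string tag)
    {
        Log.Add("Tag " + tag);
        return sequence;
    }

    public static List<string> Run()
    {
        Log.Clear();
        var query = Letters().Tag("built").Select(x => x.ToUpper());
        Log.Add("query built");      // Letters() has not run yet
        var result = query.ToList(); // enumeration happens here
        Log.Add("query enumerated");
        return result;
    }

    static void Main()
    {
        var result = Run();
        foreach (var line in Log) Console.WriteLine(line);
        Console.WriteLine(string.Join(", ", result));
    }
}
```

In this sketch "Tag built" and "query built" are logged before "Letters started", which mirrors how the four LogQuery lines above appear before any element is produced.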
Finally, shuffle.Dump() iterates over shuffle and thus evaluates the IEnumerables.
Iterating over data prints the following, because SelectMany() calls GetSecondSequence() and LogQuery() for each element in GetFirstSequence():
Executing query GetSecondSequence a
Executing query GetSecondSequence b
Executing query GetSecondSequence c
Iterating over shuffle is the same as iterating over
Interleave(data.Take(3), data.Skip(3))
Interleave() interleaves the elements from two iterations over data and thus also interleaves the output caused by iterating over them.
firstIter.MoveNext();
// writes "Executing query GetSecondSequence a"
secondIter.MoveNext();
// writes "Executing query GetSecondSequence a"
// skips "a 1" from second sequence
// skips "a 2" from second sequence
// writes "Executing query GetSecondSequence b"
// skips "b 1" from second sequence
yield return firstIter.Current; // "a 1"
yield return secondIter.Current; // "b 2"
firstIter.MoveNext();
secondIter.MoveNext();
// writes "Executing query GetSecondSequence c"
yield return firstIter.Current; // "a 2"
yield return secondIter.Current; // "c 1"
firstIter.MoveNext();
// writes "Executing query GetSecondSequence b"
secondIter.MoveNext();
yield return firstIter.Current; // "b 1"
yield return secondIter.Current; // "c 2"
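The trace above can be checked without LINQPad or a log file. The following is a sketch under a few assumptions: the class name InterleaveTrace and the Run method are hypothetical, the file log is replaced by an in-memory list, Dump() is replaced by ToList(), and the enumerators are wrapped in using statements (an addition of mine that does not affect the log order). Everything else follows the code from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class InterleaveTrace
{
    // Hypothetical in-memory log replacing the file from the question.
    public static readonly List<string> Log = new List<string>();

    static IEnumerable<string> GetFirstSequence()
    {
        yield return "a";
        yield return "b";
        yield return "c";
    }

    static IEnumerable<string> GetSecondSequence()
    {
        yield return "1";
        yield return "2";
    }

    // Logs eagerly when called and returns the sequence without enumerating it.
    static IEnumerable<string> LogQuery(this IEnumerable<string> sequence, string tag, string element = null)
    {
        Log.Add(element == null ? tag : tag + " " + element);
        return sequence;
    }

    static IEnumerable<string> Interleave(this IEnumerable<string> first, IEnumerable<string> second)
    {
        // Assumption: disposing the enumerators is added here; the question's version omits it.
        using (var firstIter = first.GetEnumerator())
        using (var secondIter = second.GetEnumerator())
        {
            while (firstIter.MoveNext() && secondIter.MoveNext())
            {
                yield return firstIter.Current;
                yield return secondIter.Current;
            }
        }
    }

    public static List<string> Run()
    {
        Log.Clear();
        var data = GetFirstSequence()
            .LogQuery("GetFirstSequence")
            .SelectMany(f => GetSecondSequence().LogQuery("GetSecondSequence", f),
                        (f, s) => $"{f} {s}");

        // data is enumerated twice: once through Take(3), once through Skip(3).
        var shuffle = data.Take(3).LogQuery("Take")
            .Interleave(data.Skip(3).LogQuery("Skip"))
            .LogQuery("Interleave");

        return shuffle.ToList(); // forces evaluation, like Dump() in LINQPad
    }

    static void Main()
    {
        Console.WriteLine(string.Join(" | ", Run()));
        foreach (var line in Log) Console.WriteLine(line);
    }
}
```

If the reasoning in the trace holds, the items come out as "a 1", "b 2", "a 2", "c 1", "b 1", "c 2", and the log entries appear in the same order as in the question: the four eager tags first, then GetSecondSequence a, a, b, c, b.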
Upvotes: 5