curtisthibault

Reputation: 209

LINQ query returns old results when source list is re-initialized

A coworker and I were discussing a situation where an IEnumerable query was returning an old result set after the source list had been re-initialized. Somewhere in the execution of the application the list was being set to null and re-populated with new values. The query itself was never redefined, and continued to return the old results. In fact, it didn't even matter if the source list remained null; the old results were still returned.

Here are some unit tests to demonstrate what we are seeing:

[Test]
public void QueryResultsBasedOnCurrentListEvenAfterUpdate()
{
  var list = new List<string> { "Two", "Three" };
  var query = list.Where(x => x.Length > 3);

  var result1 = query.ToList();

  list.Clear();
  list.AddRange(new List<string> { "Four", "Five", "One" });

  //Correctly gets an updated result set
  var result2 = query.ToList();

  //PASS
  CollectionAssert.AreEquivalent(new List<string> { "Three" }, result1);

  //PASS
  CollectionAssert.AreEquivalent(new List<string> { "Four", "Five" }, result2);
}

[Test]
public void QueryResultsBasedOnCurrentListEvenAfterSetToNullAndReInstantiated()
{
  var list = new List<string> { "Two", "Three" };
  var query = list.Where(x => x.Length > 3);

  var result1 = query.ToList();

  list = null;
  list = new List<string> { "Four", "Five", "One" };

  var result2 = query.ToList();

  //PASS
  CollectionAssert.AreEquivalent(new List<string> { "Three" }, result1);

  //FAIL : result2 == result1.  The query wasn't evaluated against the new list
  CollectionAssert.AreEquivalent(new List<string> { "Four", "Five" }, result2);
}

[Test]
public void QueryExecutionThrowsExceptionWhenListIsSetToNull()
{
  var list = new List<string> { "Two", "Three" };
  var query = list.Where(x => x.Length > 3);

  list = null;

  //FAIL : The query is still evaluated against the original list
  Assert.Throws<ArgumentNullException>(() => query.ToList());
}

It seems that despite deferred execution, these queries still point to the original list. As long as the original collection the query was built against remains alive, the query is evaluated correctly against its current contents. However, if the list variable is re-instantiated, the query remains tied to the original list.

What am I missing? Please explain...

UPDATE:
I'm seeing the same behavior for a query built as an IQueryable. Does an IQueryable also hold a reference to the original list?

Upvotes: 3

Views: 1482

Answers (3)

Eric Lippert

Reputation: 660289

I can see how that would be a bit confusing. Here's the deal, succinctly:

  • The "receiver" of the query is treated as a value -- it never changes. If the value is a reference to a list, the value continues to be a reference to that list. The contents of the list might change, but which list does not change.

  • The local variables referred to by clauses of the query are treated as variables -- the latest values of those variables are always used.

Let me give you an analogy to the real world. Suppose you are in your kitchen. You have a drawer labeled "house" and a drawer labeled "name". In the drawer labeled "house" there is a piece of paper that says "1600 Pennsylvania Avenue". In the drawer labeled "name" there is a piece of paper that says "Michelle". When you say:

var house = GetHouse("1600 Pennsylvania Avenue");
var name = "Michelle";
var query = from resident in house.Residents 
            where resident.FirstName == name 
            select resident;

That is like writing the query:

"list all the residents of the White House whose first name is (look at the piece of paper in the drawer marked "name" in my kitchen)"

The values returned by that query depend on (1) who is living in the White House, and (2) what name is written on the piece of paper when the query runs.

It is not like writing the query:

"list all the residents of (look at the piece of paper in the drawer marked "house" in my kitchen) whose first name is (look at the piece of paper in the drawer marked "name" in my kitchen)"

The object against which the query is running is not a variable. The contents of the White House can change. The name that the query is asking about can change. And therefore the results of the query can change in two ways -- with time, and with the value of the name variable. But what house the query is asking about does not ever change no matter what you do to the variable that held the reference. That variable is irrelevant to the query; its value was used to build the query.
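To make the distinction concrete, here is a minimal sketch using the strings from the question. The class name, the `Demo` method, and the `minLength` variable are made up for illustration; the point is only that the receiver is read once when the query is built, while a variable used inside the lambda is re-read each time the query runs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ReceiverVersusCapturedVariable
{
    // Hypothetical demo method: returns the query results at three points in time.
    public static string[] Demo()
    {
        var list = new List<string> { "Two", "Three" };
        var minLength = 3;

        // 'list' is read once, here, to obtain the receiver object.
        // 'minLength' is captured as a variable and re-read on every enumeration.
        var query = list.Where(x => x.Length > minLength);

        var first = string.Join(", ", query);   // "Three"

        // Changing the captured variable changes the result.
        minLength = 2;
        var second = string.Join(", ", query);  // "Two, Three"

        // Reassigning 'list' does not affect the query: it still holds
        // a reference to the original list object.
        list = new List<string> { "Four", "Five", "One" };
        var third = string.Join(", ", query);   // "Two, Three"

        return new[] { first, second, third };
    }

    public static void Main()
    {
        foreach (var line in Demo())
        {
            Console.WriteLine(line);
        }
    }
}
```

In terms of the analogy, `minLength` is the paper in the "name" drawer, and the original list object is the White House itself.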

Upvotes: 9

Joachim Isaksson

Reputation: 180987

It's as simple as this: query (through an IEnumerable) holds a reference to the collection it was created with.

Changing another variable that references the same List (i.e., setting list to null) does not change the reference query already holds.

  • The first test changes the underlying data in the actual list query references, which indeed changes the result.

  • In the second test you create a new list which leaves query still referencing the previous list. Changing list does not change query.

  • The third test only nulls out list which has no effect on the reference query already holds.
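If the intent is for the query to follow whatever list currently refers to, one possible approach (a sketch, not something proposed in the question) is to wrap the query construction in a delegate so that the variable is read again on each call. The names `RebuildQueryOnDemand`, `Demo`, and `currentQuery` are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RebuildQueryOnDemand
{
    // Hypothetical demo method: returns the results of both approaches
    // after the 'list' variable has been reassigned.
    public static List<string>[] Demo()
    {
        var list = new List<string> { "Two", "Three" };

        // Built once: holds a reference to the original list object.
        IEnumerable<string> query = list.Where(x => x.Length > 3);

        // The lambda captures the variable 'list' itself, so each call
        // builds a new query over whatever 'list' references at that moment.
        Func<IEnumerable<string>> currentQuery = () => list.Where(x => x.Length > 3);

        list = new List<string> { "Four", "Five", "One" };

        return new[]
        {
            query.ToList(),          // still { "Three" }
            currentQuery().ToList()  // { "Four", "Five" }
        };
    }
}
```

With this shape, setting list to null and then calling `currentQuery()` would throw an ArgumentNullException, because Where is then invoked with a null source. That is the behavior the third test was expecting from the original query.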

Upvotes: 1

Chris Taylor

Reputation: 53709

Changing where the reference 'list' points to does not change the original data. When the query expression was written, the 'Where' method took its own reference to the data, and it will work on that data regardless of where the 'list' variable subsequently points.

In this case, Where returns a new instance of an IEnumerable that references the data list currently points to. When you then change what list points to, the IEnumerable does not change; it already has its reference to the data.
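One way to see this is to write the extension-method call in its static form, where list is plainly just an argument whose value is passed in at the time of the call. The sketch below assumes the same data as the question; the class name, method name, and the `longerThanThree` delegate are illustrative only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class WhereTakesItsOwnReference
{
    // Hypothetical demo method: returns how many items each query yields
    // after the 'list' variable has been set to null.
    public static int[] Demo()
    {
        var list = new List<string> { "Two", "Three" };
        Func<string, bool> longerThanThree = x => x.Length > 3;

        // These two calls are equivalent; the first is syntactic sugar
        // for the second. In both, the current value of 'list' (a reference
        // to the list object) is passed as the 'source' argument.
        IEnumerable<string> viaExtension = list.Where(longerThanThree);
        IEnumerable<string> viaStatic = Enumerable.Where(list, longerThanThree);

        // Clearing the variable does not clear the reference each query holds.
        list = null;

        // Both queries still enumerate the original list and find "Three".
        return new[] { viaExtension.Count(), viaStatic.Count() };
    }
}
```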

Upvotes: 3
