Reputation: 8166
I'm using HtmlAgilityPack to parse a page of HTML and retrieve a number of option elements from a select list.
GvsaDivisions is a method that returns raw HTML from the result of a POST; it is irrelevant in the context of the question.
public IEnumerable<SelectListItem> Divisions(string season, string gender, string ageGroup)
{
    var document = new HtmlDocument();
    var html = GvsaDivisions(season);
    document.LoadHtml(html);
    var options = document.DocumentNode.SelectNodes("//select//option")
        .Select(x => new SelectListItem() { Value = x.GetAttributeValue("value", ""), Text = x.NextSibling.InnerText });
    var divisions = options.Where(x => x.Text.Contains(string.Format("{0} {1}", ageGroup, gender)));
    if (ageGroup == "U15/U16")
    {
        ageGroup = "U15/16";
    }
    if (ageGroup == "U17/U19")
    {
        ageGroup = "U17/19";
    }
    return divisions;
}
What I'm observing is this: once options.Where() is executed, divisions contains a single result. After the test ageGroup == "U15/U16" and the assignment ageGroup = "U15/16", divisions contains 3 results (the original 1, plus 2 new ones matching the new value of ageGroup).
Can anybody explain this anomaly? I expected to make a call to Union the result of a new Where query to the original results, but it seems it's happening automagically. While the results are what I desire, I have no way to explain how it's happening (or the certainty that it'll continue to act this way)
Upvotes: 0
Views: 115
Reputation: 18463
LINQ queries use deferred execution, which means they are run whenever you enumerate the result.
When you change a variable that is being used in your query, you actually are changing the result of the next run of the query, which is the next time you iterate the result.
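To make the re-evaluation concrete, here is a minimal sketch that uses plain strings instead of the HTML nodes from the question. The names DeferredDemo and CountMatches and the sample data are made up for illustration. The same query object is enumerated twice, and each enumeration reads the current value of the captured variable.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // Returns the match counts seen before and after the captured variable changes.
    public static (int Before, int After) CountMatches()
    {
        var items = new List<string> { "red apple", "green apple", "red pear" };
        string color = "red";

        // Nothing is filtered yet; the lambda captures the variable 'color', not its current value.
        IEnumerable<string> query = items.Where(x => x.Contains(color));

        int before = query.Count();   // runs the filter with "red"
        color = "green";
        int after = query.Count();    // runs the filter again, now with "green"

        return (before, after);
    }

    public static void Main()
    {
        var (before, after) = CountMatches();
        Console.WriteLine($"before: {before}, after: {after}");   // before: 2, after: 1
    }
}
```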
This is actually by design, and in many situations it is very useful and sometimes necessary. But if you need immediate evaluation, you can call the ToList() method at the end of your query, which materializes your query and gives you a normal List<T> object.
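As a rough illustration of the difference, the sketch below builds the same filter twice, once materialized with ToList() and once left deferred, and then changes the captured variable. SnapshotDemo, Run, and the sample data are hypothetical names chosen for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SnapshotDemo
{
    public static (List<string> Snapshot, List<string> Lazy) Run()
    {
        var items = new List<string> { "apple", "orange", "pear" };
        string match = "orange";

        // ToList() runs the filter immediately and stores the results.
        List<string> snapshot = items.Where(x => x == match).ToList();

        // Without ToList(), the filter only runs when the sequence is enumerated.
        IEnumerable<string> lazy = items.Where(x => x == match);

        match = "pear";

        // The snapshot is unaffected; the deferred query now sees "pear".
        return (snapshot, lazy.ToList());
    }

    public static void Main()
    {
        var (snapshot, lazy) = Run();
        Console.WriteLine("snapshot: " + string.Join(", ", snapshot));   // snapshot: orange
        Console.WriteLine("lazy: " + string.Join(", ", lazy));           // lazy: pear
    }
}
```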
Upvotes: 6
Reputation: 700
I'm thinking along the same lines as Travis: LINQ's deferred execution.
I'm not sure this is the fix you want, but I generally materialize my results into a concrete collection right away, like this. In my experience, once the results are in a concrete collection such as a List<T>, the query is no longer deferred.
List<SelectListItem> options = document.DocumentNode.SelectNodes("//select//option")
    .Select(x => new SelectListItem() { Value = x.GetAttributeValue("value", ""), Text = x.NextSibling.InnerText })
    .Where(x => x.Text.Contains(string.Format("{0} {1}", ageGroup, gender)))
    .ToList<SelectListItem>();
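If the goal is to match both spellings of an age group, one option is to state that explicitly rather than depend on when the query happens to run. The following is only a sketch under assumptions: plain strings stand in for the option text, and the alias table, DivisionFilter, and FilterDivisions are hypothetical names, not part of the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DivisionFilter
{
    // Hypothetical alias table: the spelling passed in, mapped to the alternate spelling in the data.
    private static readonly Dictionary<string, string> Aliases = new Dictionary<string, string>
    {
        { "U15/U16", "U15/16" },
        { "U17/U19", "U17/19" },
    };

    public static List<string> FilterDivisions(IEnumerable<string> optionTexts, string ageGroup, string gender)
    {
        // Build every pattern up front so the filter does not depend on later reassignment.
        var patterns = new List<string> { string.Format("{0} {1}", ageGroup, gender) };
        if (Aliases.TryGetValue(ageGroup, out var alias))
        {
            patterns.Add(string.Format("{0} {1}", alias, gender));
        }

        // ToList() evaluates the query here, so the caller gets a fixed result.
        return optionTexts
            .Where(text => patterns.Any(p => text.Contains(p)))
            .ToList();
    }

    public static void Main()
    {
        var texts = new[] { "U15/U16 Boys Elite", "U15/16 Boys 1", "U15/16 Girls 1", "U13 Boys" };
        foreach (var text in FilterDivisions(texts, "U15/U16", "Boys"))
        {
            Console.WriteLine(text);
        }
    }
}
```

With the sample data above, only the two "Boys" entries in the U15/U16 group are returned, regardless of when or how often the caller enumerates the list.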
Upvotes: 0
Reputation: 485
The divisions variable holds an unevaluated query that runs x.Text.Contains(string.Format("{0} {1}", ageGroup, gender)) against each element in the list of nodes only when it is enumerated. Since you change ageGroup before the query is enumerated, it uses the new value instead of the old one.
For example, the following code outputs a single line with the text "pear":
List<string> strings = new List<string> { "apple", "orange", "pear", "watermelon" };
string matchString = "orange";
var queryOne = strings.Where(x => x == matchString);
matchString = "pear";
foreach (var item in queryOne)
{
    Console.WriteLine(" " + item);
}
Upvotes: 1