Reputation: 8009
I want to build a tree structure like this:
root Id 1
child id 2
grandChild id 3
Code sample below. If I use GetChildrenNodesCorrect(),
I get the correct result. But when GetChildrenNodesWrong()
is used, it returns below:
root Id 1
child id 2
Null
I know that ToList()
is not deferred execution, and returns result immediatelly. Could anyone explain this?
public class ToListTest
{
public static void Entry()
{
var toListTest = new ToListTest();
toListTest.Test();
}
public void Test()
{
List<Node> newsList = new List<Node>
{
new Node{Id = 1, ParentId = 0},
new Node{Id = 2, ParentId = 1},
new Node{Id = 3, ParentId = 2}
};
var root = BuildUpTree(newsList);
}
private TreeNode BuildUpTree(List<Node> newsList)
{
var root = new TreeNode { currentNode = newsList.First(n => n.ParentId == 0) };
BuildUpTreeChildrenNodes(newsList, root);
return root;
}
private void BuildUpTreeChildrenNodes(List<Node> newsList, TreeNode currentTreeNode)
{
currentTreeNode.Children = GetChildrenNodesWrong(newsList, currentTreeNode);
foreach (var node in currentTreeNode.Children)
{
BuildUpTreeChildrenNodes(newsList, node);
}
}
private IEnumerable<TreeNode> GetChildrenNodesWrong(List<Node> newsList, TreeNode cuurentNode)
{
return newsList.Where(n => n.ParentId == cuurentNode.currentNode.Id)
.Select(n => new TreeNode
{
currentNode = n
});
}
private IEnumerable<TreeNode> GetChildrenNodesCorrect(List<Node> newsList, TreeNode cuurentNode)
{
return GetChildrenNodesWrong(newsList, cuurentNode).ToList();
}
public class TreeNode
{
public Node currentNode { get; set; }
public IEnumerable<TreeNode> Children { get; set; }
}
public class Node
{
public int Id { get; set; }
public int ParentId { get; set; }
}
}
Update
In debug, when using GetChildrenNodesWrong(),
root has both child and grandchild before the method returns. After the method returns, root has only child, and grandchild is null.
Update 2
IMO, the problem might not be related to clean code. But anyone is welcome to show more intuitive code.
Upvotes: 3
Views: 509
Reputation: 48076
I'm not entirely sure what you're asking. All LINQ queries have deferred execution, when you call ToList()
you're simply forcing the query to execute. I think the main problem is in your where clause. Only two objects satisfy the condition so the IEnumerable returned by your LINQ query should only have 2 objects.
It's not doing what you expect because the LINQ query in GetChildrenNodesWrong
is producing an "off by one" error. Here is basically what happens;
1) we feed it root for n = root nothing happens. We move to the next node.
2) n.Id = 1, the where condition is met by node 2 as it's parentId is 1. We allocate a new node, point current to node 2
3) We get to the third node now. n.ParentId = 2 and current.Id = 2. We have a match so we allocated another node and point current to node 3.
4) we're at the end of the list. Grand child is never allocated because we're off by one.
Basically you iterate x time where x is the length of the list but because current = n on the first iteration you don't allocate a node so you end up with x -1 nodes when you're expecting x.
Upvotes: 0
Reputation: 16324
Every time the IEnumerable
is evaluated the Linq query is re-executed. So, when you're computing the tree, it is allocating space for nodes but not assigning them to any permanent variable. This means that in the foreach
loop in BuildUpTreeChildrenNodes
, you are not calling the recursive function on the instance of the node you want. Instead, you're calling it on a re-instantiated version of the node that has been created by the foreach
loop (which enumerates the IEnumerable
). When you call ToList
on the IEnumerable
instead, then the foreach
loop will return the elements of the list, which is in memory.
If you make root
public static, and then debug your code, you'll see that when you call BuildUpTreeChildrenNodes
, the node
argument is not the instance of the node that you want. Even though it has the same ID and represents the same node in the graph, it is not actually connected in any way to the root node. Check:
root.Children.Any(n => n.Id == node.Id) //true
root.Children.Contains(node) //false
The simplest way to see your problem is here:
//Create a singleton Node list:
var nodeSingleton= Enumerable.Range(0, 1).Select(x => new Node { Id = x });
Console.Write(nodeSingleton.Single() == nodeSingleton.Single());
You might expect this to return true
, but in fact it will be false
- both times the Single
Linq method is called, the deferred execution of the singleton variable is re-evaluated, and returns a different instance of the Node
class.
If you call ToList
on the singleton, however, then you get the list in memory and the Single
method will return the same instance of the Node
.
More broadly, I think the problem with this code is that it mixes up imperative and functional code too much. It is strange that so many of the methods are void
and then the GetChildrenNodesWrong
method is not. I think you should pick a style and stick with it, since switching paradigms can be confusing.
Upvotes: 1