Reputation: 308
Given a parent, I would like to get its nodes, of level N (original dataframe is way larger)
child| level| parent|
40| 1| 40|
21| 2| 40|
18| 2| 40|
37| 3| 21|
2| 3| 18|
85| 3| 21|
14| 4| 37|
58| 4| 2|
47| 4| 37|
34| 4| 85|
45| 4| 18|
32| 4| 2|
47| 4| 85|
88| 4| 85|
12| 4| 37|
What I do:
def get_children(x, tree, lst):
for nod in tree[tree['parent'] == x]['child']:
get_children(nod, tree, lst)
lst.append(nod)
and then I filter all nodes of level N.
I would like another way because there are too much recursion calls when the dataframe is larger.
Upvotes: 1
Views: 287
Reputation: 1076
Using an auxiliary data structure you could solve this task in two steps:
Without further assumptions (e.g. that the adjacency list is ordered, and higher level nodes cannot be parents of lower level ones), you would need to cycle through the list at least twice (once to create the parents index, then once more for each lookup). But if the aforementioned assumptions can be made, a single loop would be sufficient.
# determine all predecessors of each node in tree
def index_parents(tree):
parents = {}
for i, n in tree.iterrows():
child = int(n['child'])
parent = int(n['parent'])
# if this child node is seen for the first time, initialize parents
if child not in parents:
parents[child] = set()
# add current parent to predecessor list
parents[child].update(set([parent]))
# add predecessors of parent to predecessor list
parents[child].update(parents[parent])
return parents
# check all nodes at given level if they contain x as predecessor
def get_children(x, tree, level, parents):
x_children = []
for i, n in tree.iterrows():
child = int(n['child'])
if n['level'] == level and x in parents[child]:
# x is in predecessors of child, add it to the result
x_children.append(child)
return x_children
A test given you test data:
>>> parents = index_parents(tree)
>>> get_children(21, tree, 3, parents)
>>> [37, 85]
It may seem a bit backwards, but it avoids the recursion.
Upvotes: 1