Reputation: 2773
I have the following tree data structure:
class TreeNode:
def __init__(self, path, value=None):
self.path = path
self.value = value
self.children = []
Where path
is a directory or filename. If the the value of path
is a file (a leaf in the tree), then the value is an integer, otherwise it is None
. Here's an example of this tree structure:
└── root (None)
├── dir0 (None)
| ├── dir00 (None)
| | └── file000.txt (10)
| └── file00.txt (10)
├── dir1 (None)
| └── file10.txt (5)
├── dir2 (None)
| ├── file20.txt (10)
| └── file21.txt (15)
└── dir3 (None)
├── dir30 (None)
| └── file300.txt (15)
└── file30.txt (10)
I am trying to return the smallest possible flattened list of resolved paths and their associated value
. If all of the nodes in a subtree have the same value
, then we say that such a subtree has the same value
as its nodes. Essentially, the value
for each node bubbles up to its parent if all children of the parent have the same value
.
For example, what should be returned with the above tree is:
/root/dir0: 10
/root/dir1: 5
/root/dir2/file20.txt: 10
/root/dir2/file21.txt: 15
/root/dir3/dir30: 15
/root/dir3/file30.txt: 10
I've tried a couple different ways to accomplish this: traversing the tree with a stack, traversing the tree with recursion, and using sets; all have been unsuccessful. My most recent attempt's pseudocode looks like:
def build_list(self, treenode):
if treenode.value:
return [(treenode.path, treenode.value)]
if treenode.value == None:
s = set()
for child in treenode.children:
potential_values = self.build_list(child)
for val in potential_values:
s |= {val[1]}
if len(s) == 1:
return [(treenode.path, treenode.value)]
else:
return [(child.path, child.value) for child in treenode.children]
How would I accomplish this? Pseudocode is totally fine, I'm looking for an approach, not necessarily a full implementation.
Upvotes: 1
Views: 780
Reputation: 543
Here's a way to do it. I'm using a different node class to make things a little easier to setup:
class TreeNode:
def __init__(self, node_id, value=None):
self.node_id = node_id
self.value = value
self.children = []
def addChildren(self, *node_list):
self.children += node_list
def __repr__(self):
return f'<TreeNode(node_id={self.node_id}, value={self.value})'
The __repr__()
is just there to get a human-readable representation of an instance when printing.
I attempted to set up the tree to mirror your example, but feel free to let me know if it's not exactly right. (It could be done more succinctly, but I've written one line per object for clarity.)
root = TreeNode('root')
dir0 = TreeNode('dir0')
dir00 = TreeNode('dir00')
file00 = TreeNode('file00', 10)
file000 = TreeNode('file000', 10)
dir1 = TreeNode('dir1')
file10 = TreeNode('file10', 5)
dir2 = TreeNode('dir2')
file20 = TreeNode('file20', 10)
file21 = TreeNode('file21', 15)
dir3 = TreeNode('dir3')
dir30 = TreeNode('dir30')
file30 = TreeNode('file30', 10)
file300 = TreeNode('file300', 15)
root.addChildren(dir0, dir1, dir2, dir3)
dir0.addChildren(dir00, file00)
dir00.addChildren(file000)
dir1.addChildren(file10)
dir2.addChildren(file20, file21)
dir3.addChildren(dir30, file30)
dir30.addChildren(file300)
Now for the recusive function:
def buildList(node):
# Return node id and value if no children
if not node.children:
return [(node.node_id, node.value)]
# Call buildList on each child and get distinct values
childitems = [item for child in node.children for item in buildList(child)]
childvalues = set(childitem[1] for childitem in childitems)
# If value is unique, return this node as if it has the unique value
if len(childvalues) == 1:
return [(node.node_id, childvalues.pop())]
# Otherwise, return all results
return childitems
This produces the following output:
>>> buildList(root)
[('dir0', 10),
('dir1', 5),
('file20', 10),
('file21', 15),
('dir30', 15),
('file30', 10)]
[Edit] Note that this returns a list and doesn't mutate the nodes.
Regarding your particular implementation, you might want to think about what happens if there's an empty directory. Is that allowed? If so, it's a leaf but not a file. Will it have a value in that case?
Upvotes: 0
Reputation: 3407
Method 1
Recursion should work:
def recurse_update_value(treenode):
if not treenode.children:
return
for child in treenode.children:
recurse_update_value(child)
if all(x.value==treenode.children[0].value for x in treenode.children):
treenode.value = treenode.children[0].value
treenode.children = []
Method 2
Additionally, if you can edit the TreeNode class, you can set the getter method to automatically update the children.
class TreeNode:
def __init__(self, path, value=None):
self.path = path
self._value = value
self.children = []
@property
def value(self)
if not self.children:
return self._value
first_child_value = self.children[0].value
if all(x.value==first_child_value for x in self.children)
self._value = first_child_value
self.children = []
return self._value
Then, you simply have to call topnode.value
to update the tree.
Upvotes: 3