Reputation: 10923
I am creating a tree structure out of a text file input using a function from SO question: Python file parsing: Build tree from text file. But I am able to produce my tree only by using a global variable and cannot find a way of avoiding this.
In a file called data.txt
I have the following:
Root
-A 10
-B
--B A 2
--B B 5
--B Z 9
--B X
---X 4
---Y
----Y0 67
----Y1 32
---Z 3
-C 19
{'B': ['B A 2', 'B B 5', 'B Z 9', 'B X'],
'B X': ['X 4', 'Y', 'Z 3'],
'Root': ['A 10', 'B', 'C 19'],
'Y': ['Y0 67', 'Y1 32']}
import re, pprint
PATTERN = re.compile('^[-]+')
tree = {}
def _recurse_tree(parent, depth, source):
last_line = source.readline().rstrip()
while last_line:
if last_line.startswith('-'):
tabs = len( re.match(PATTERN, last_line).group() )
else:
tabs = 0
if tabs < depth:
break
node = re.sub(PATTERN, '', last_line.strip())
if tabs >= depth:
if parent is not None:
print "%s: %s" %(parent, node)
if parent in tree:
tree[parent].append(node)
else:
tree[parent] = [ node, ]
last_line = _recurse_tree(node, tabs+1, source)
return last_line
def main():
inFile = open("data.txt")
_recurse_tree(None, 0, inFile)
pprint.pprint(tree)
if __name__ == "__main__":
main()
How do I get rid of the global variable tree
? Everything I do seems to make the code much longer or uglier but I would like to use the function heavily and I hate depending on side-effects for the core result.
After the answers below, I revised the code to return the tree
in the following way. Is this pythonic? Returning a tuple and then tossing the first element seems inelegant.
def _recurse_tree(parent, depth, source, tree=None):
if tree is None:
tree = {}
last_line = source.readline().rstrip()
while last_line:
if last_line.startswith('-'):
tabs = len( re.match(PATTERN, last_line).group() )
else:
tabs = 0
if tabs < depth:
break
node = re.sub(PATTERN, '', last_line.strip())
if tabs >= depth:
if parent is not None:
print "%s: %s" %(parent, node)
if parent in tree:
tree[parent].append(node)
else:
tree[parent] = [ node, ]
last_line, tree = _recurse_tree(node, tabs+1, source, tree)
return last_line, tree
def main():
inFile = open("data.txt")
tmp, tree = _recurse_tree(None, 0, inFile)
pprint.pprint(tree)
Upvotes: 3
Views: 913
Reputation: 1124288
Your tree
variable is already a mutable; just pass it along with your recursive calls:
def _recurse_tree(parent, depth, source, tree=None):
if tree is None:
tree = {}
last_line = source.readline().rstrip()
while last_line:
if last_line.startswith('-'):
tabs = len( re.match(PATTERN, last_line).group() )
else:
tabs = 0
if tabs < depth:
break
node = re.sub(PATTERN, '', last_line.strip())
if tabs >= depth:
if parent is not None:
print "%s: %s" %(parent, node)
if parent in tree:
tree[parent].append(node)
else:
tree[parent] = [ node, ]
last_line = _recurse_tree(node, tabs+1, source, tree)
return last_line
Alternatively, you can use a class to hold the state, it'll be easier to then pluck the state from the instance:
class TreeBuilder(object):
_PATTERN = re.compile('^[-]+')
def __init__(self, source):
self.tree = {}
self.source = source
self._recurse_tree()
def _recurse_tree(self, parent=None, depth=0):
last_line = self.source.readline().rstrip()
while last_line:
if last_line.startswith('-'):
tabs = len( self._PATTERN.match(last_line).group() )
else:
tabs = 0
if tabs < depth:
break
node = self._PATTERN.sub('', last_line.strip())
if tabs >= depth:
if parent is not None:
print "%s: %s" %(parent, node)
if parent in self.tree:
self.tree[parent].append(node)
else:
self.tree[parent] = [ node, ]
last_line = self._recurse_tree(node, tabs+1)
return last_line
Then use this like this:
def main():
inFile = open("data.txt")
builder = TreeBuilder(inFile)
pprint.pprint(builder.tree)
Upvotes: 4
Reputation: 213351
Use tree
as default parameter in your function: -
def _recurse_tree(parent, depth, source, tree = None):
if tree is None: # needed on first invocation
tree = {}
Call it first time without tree
parameter, and on every successive call, add another parameter tree
to it.
So, from inside your method your recursive call becomes: -
last_line = _recurse_tree(node, tabs+1, source, tree)
Upvotes: 0
Reputation: 5022
I think good solution here — create class and put tree inside, like a private class member.
Or you can simply pass this dictionary as one of the parameters in function and pass it during recursion. It'll pass by reference, so all time function will use the same dictionary without global variable. But I'd prefer class.
Upvotes: 1