Reputation: 953
Using tree = lambda: dedfaultdict(tree)
, I can replace the following code:
from collections import defaultdict
END = '$'
words = ['hi', 'hello', 'hiya', 'hey']
root = {}
for word in words:
node = root
for ch in word:
node = node.setdefault(ch, {}) # <---- Code that can be replaced
node[END] = None
with:
from collections import defaultdict
END = '$'
words = ['hi', 'hello', 'hiya', 'hey']
tree = lambda: defaultdict(tree)
root = tree()
for word in words:
node = root
for ch in word:
node = node[ch] # <------ Replaced code
node[END] = None
What I actually want is for each dictionary node to have a backreference to its parent dictionary node. I can do this as follows:
from collections import defaultdict
BACKREF, END = 'BACKREF', '$'
words = ['hi', 'hello', 'hiya', 'hey']
root = {}
for word in words:
node = root
for ch in word:
node = node.setdefault(ch, {BACKREF: node}) # <---- Code I want to replace
node[END] = None
(proof that this works: link)
So, given that I was able to use tree = lambda: defaultdict(tree)
to replace
node = node.setdefault(ch, {})
node = node[ch]
is there a way I can use a modified version of tree = lambda: default(tree)
to replace
node = node.setdefault(ch, {BACKREF: node})
node = node[ch]
?I have tried something like:
def tree():
_ = defaultdict(tree)
_[BACKREF] = ?
return _
root = tree()
h = root['h']
But that would require tree
to know which dictionary invoked the call to tree
. E.g. in h = root['h']
, root['h']
invokes a call to tree
because h
is not yet in root
. tree
would have to know it was invoked via the call root['h']
so that it could do h[BACKREF] = root
. Is there a way around this? Is it a bad idea even if it can be done?
I know the backreferences technically means that the trie will have cycles (and not be truly a tree), but the way I plan to traverse the trie, this won't be an issue. The reason I want backreferences is because it will be useful if I want to delete a word from the trie. For example, say I have the following trie:
and that I am at root['h']['e']['l']['l']['o']
and want to delete 'hello'
from the trie. I can do this by backtracking up the trie from root['h']['e']['l']['l']['o']
to root['h']['e']['l']['l']
to root['h']['e']['l']
to root['h']['e']
(I stop here because len(set(root['h']['e'].keys()) - {BACKREF}) > 1
. Then I can simply do del root['h']['e']['l']
and I will have cleaved off the 'llo$'
from 'he'
meaning the trie will still have 'hey'
. Although there are alternatives, backtracking up the trie will be very easy with the backreferences.
tree = lambda: defaultdict(tree)
Using:
from collections import defaultdict
tree = lambda: defaultdict(tree)
root = tree()
one can create arbitrarily nested dict
s. E.g. after:
root['h']['i']
root['h']['e']['l']['l']['o']
root['h']['i']['y']['a']
root['h']['e']['y']
root
will look like:
{'h': {'i': {'y': {'a': {}}}, 'e': {'y': {}, 'l': {'l': {'o': {}}}}}}
This represents a tree that looks like this:
visualized using https://www.cs.usfca.edu/~galles/visualization/Trie.html
Upvotes: 1
Views: 212
Reputation: 51063
The behaviour you're trying to implement seems easier to write as a class rather than a function.
from collections import defaultdict
class Tree(defaultdict):
def __init__(self, backref=None):
super().__init__(self.make_child)
self.backref = backref
def make_child(self):
return Tree(backref=self)
Usage:
>>> root = Tree()
>>> root['h'].backref is root
True
>>> root['h']['e'].backref is root['h']
True
Upvotes: 2
Reputation: 10239
Giving the same {BACKREF: node}
to defaultdict
:
from collections import defaultdict
BACKREF, END = 'BACKREF', '$'
words = ['hi', 'hello', 'hiya', 'hey']
tree = lambda: defaultdict(tree, {BACKREF: node})
node = None
root = tree()
for word in words:
node = root
for ch in word:
node = node[ch]
node[END] = None
The root
node has a backref None
, can be deleted if bothersome.
The above works fine if that code is the only code that creates tree nodes (seems likely to me, judging by the times I've built such trees myself). Otherwise you'd need to ensure that node
points to the correct parent node. If that's an issue, here's an alternative with a dict (not defaultdict) subclass that implements __missing__
to automatically create children with backrefs when needed:
BACKREF, END = 'BACKREF', '$'
words = ['hi', 'hello', 'hiya', 'hey']
class Tree(dict):
def __missing__(self, key):
child = self[key] = Tree({BACKREF: self})
return child
root = Tree()
for word in words:
node = root
for ch in word:
node = node[ch]
node[END] = None
Also doesn't give the root a backref, and being a dict, its string representations are far less cluttered than a defaultdict's and thus far more readable:
>>> import pprint
>>> pprint.pp(root)
{'h': {'BACKREF': <Recursion on Tree with id=2494556270320>,
'i': {'BACKREF': <Recursion on Tree with id=2494556270400>,
'$': None,
'y': {'BACKREF': <Recursion on Tree with id=2494556270480>,
'a': {'BACKREF': <Recursion on Tree with id=2494556340608>,
'$': None}}},
'e': {'BACKREF': <Recursion on Tree with id=2494556270400>,
'l': {'BACKREF': <Recursion on Tree with id=2494556340288>,
'l': {'BACKREF': <Recursion on Tree with id=2494556340368>,
'o': {'BACKREF': <Recursion on Tree with id=2494556340448>,
'$': None}}},
'y': {'BACKREF': <Recursion on Tree with id=2494556340288>,
'$': None}}}}
The defaultdict result for comparison:
>>> pprint.pp(root)
defaultdict(<function <lambda> at 0x000001A13760BE50>,
{'BACKREF': None,
'h': defaultdict(<function <lambda> at 0x000001A13760BE50>,
{'BACKREF': <Recursion on defaultdict with id=1791930855152>,
'i': defaultdict(<function <lambda> at 0x000001A13760BE50>,
{'BACKREF': <Recursion on defaultdict with id=1791930855312>,
'$': None,
'y': defaultdict(<function <lambda> at 0x000001A13760BE50>,
{'BACKREF': <Recursion on defaultdict with id=1791930912832>,
'a': defaultdict(<function <lambda> at 0x000001A13760BE50>,
{'BACKREF': <Recursion on defaultdict with id=1791930913232>,
'$': None})})}),
'e': defaultdict(<function <lambda> at 0x000001A13760BE50>,
{'BACKREF': <Recursion on defaultdict with id=1791930855312>,
'l': defaultdict(<function <lambda> at 0x000001A13760BE50>,
{'BACKREF': <Recursion on defaultdict with id=1791930912912>,
'l': defaultdict(<function <lambda> at 0x000001A13760BE50>,
{'BACKREF': <Recursion on defaultdict with id=1791930912992>,
'o': defaultdict(<function <lambda> at 0x000001A13760BE50>,
{'BACKREF': <Recursion on defaultdict with id=1791930913072>,
'$': None})})}),
'y': defaultdict(<function <lambda> at 0x000001A13760BE50>,
{'BACKREF': <Recursion on defaultdict with id=1791930912912>,
'$': None})})})})
Upvotes: 2