Reputation: 630
I need to add the ability to list suffixes to implement our autocomplete feature. To do that, I am implementing a function on the TrieNode object that returns all complete word suffixes that exist below it in the trie. For example, if our Trie contains the words ["fun", "function", "factory"] and we ask for suffixes from the f node, we would expect to receive ["un", "unction", "actory"] back from node.get_suffixes(). Here is how I got started:
class TrieNode:
    def __init__(self):
        ## Initialize this node in the Trie
        self.word_end = False
        self.children = dict()

    def insert(self, char):
        ## Add a child node in this Trie
        if not char in self.children:
            self.children[char] = TrieNode()

    def get_suffixes(self):
        pass
I have tested the get_suffixes function separately and it seemed to work fine.
result = []

def get_suffixes(node, suffix=""):
    if not node.children == dict():
        for key in node.children:
            suffix += key
            if node.children[key].word_end:
                result.append(suffix)
            get_suffixes(node.children[key], suffix)
            suffix = suffix[:-1]
    return result
Here is how I tested the function:
# Create a mock trie for the test
node = TrieNode()
node.insert("A")
node.children["A"].word_end = True
node.children["A"].insert("t")
node.children["A"].children["t"].word_end = True
node.children["A"].insert("b")
node.children["A"].children["b"].insert("a")
node.children["A"].children["b"].children["a"].insert("c")
node.children["A"].children["b"].children["a"].children["c"].insert("a")
node.children["A"].children["b"].children["a"].children["c"].children["a"].word_end = True
node.children["A"].insert("d")
node.children["A"].children["d"].insert("d")
node.children["A"].children["d"].children["d"].word_end = True
node.children["A"].children["d"].insert("m")
node.children["A"].children["d"].children["m"].insert("i")
node.children["A"].children["d"].children["m"].children["i"].insert("n")
node.children["A"].children["d"].children["m"].children["i"].children["n"].word_end = True
result = []

def get_suffixes(node, suffix=""):
    if not node.children == dict():
        for key in node.children:
            suffix += key
            if node.children[key].word_end:
                result.append(suffix)
            get_suffixes(node.children[key], suffix)
            suffix = suffix[:-1]
    return result

get_suffixes(node.children["A"]) # Returns ['t', 'baca', 'dd', 'dmin'], as expected
The problem occurred when I tried moving the get_suffixes function into the TrieNode class. I do not know how to handle the global variable result, since it should no longer be global. I have tried two versions:
Version I: make result an attribute set in __init__
class TrieNode:
    def __init__(self):
        ## Initialize this node in the Trie
        self.word_end = False
        self.children = dict()
        self.result = []

    def insert(self, char):
        ## Add a child node in this Trie
        if not char in self.children:
            self.children[char] = TrieNode()

    def get_suffixes(self, suffix=""):
        if not self.children == dict():
            for key in self.children:
                suffix += key
                if self.children[key].word_end:
                    self.result.append(suffix)
                self.children[key].get_suffixes(suffix)
                suffix = suffix[:-1]
        return self.result

node.children["A"].get_suffixes() # Returns ['t'], which is wrong
Version II: make result a default function parameter
class TrieNode:
    def __init__(self):
        ## Initialize this node in the Trie
        self.word_end = False
        self.children = dict()

    def insert(self, char):
        ## Add a child node in this Trie
        if not char in self.children:
            self.children[char] = TrieNode()

    def suffixes(self, suffix="", result=[]):
        if not self.children == dict():
            for key in self.children:
                suffix += key
                if self.children[key].word_end:
                    result.append(suffix)
                self.children[key].suffixes(suffix)
                suffix = suffix[:-1]
        return result

node.children["A"].suffixes() # Returns ['t', 'baca', 'dd', 'dmin']
node.children["A"].suffixes() # Returns ['t', 'baca', 'dd', 'dmin', 't', 'baca', 'dd', 'dmin']
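The result of Version II is not surprising, because a default argument is created once, when the function is defined, and is then shared by every call that does not pass its own value:
def append(number, number_list=[]):
    number_list.append(number)
    print(number_list)
    return number_list
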
append(5) # expecting: [5], actual: [5]
append(7) # expecting: [7], actual: [5, 7]
append(2) # expecting: [2], actual: [5, 7, 2]
I am learning algorithms and data structures in Python, and I was asked to solve this with a recursive function. Other approaches, such as Implementing a Trie to support autocomplete in Python, are not the answers I am looking for, even though they might solve the problem. I am extremely curious why self.result is not properly modified in Version I, when the same logic works properly outside a class.
Upvotes: 2
Views: 522
Reputation: 1062
result belongs to each TrieNode instance, not to the trie as a whole.
When you return self.result from the get_suffixes method, you are only including the answers found in the current TrieNode instance.
You need to include the answers found by its children as well. Thanks to recursion, the code needs only a minor change: adding self.result += self.children[key].get_suffixes(suffix) makes everything work.
class TrieNode:
    def __init__(self):
        ## Initialize this node in the Trie
        self.word_end = False
        self.children = dict()
        self.result = []

    def insert(self, char):
        ## Add a child node in this Trie
        if not char in self.children:
            self.children[char] = TrieNode()

    def get_suffixes(self, suffix=""):
        ## Note: self.result is never cleared, so call this once per node
        if not self.children == dict():
            for key in self.children:
                suffix += key
                if self.children[key].word_end:
                    self.result.append(suffix)
                ## Always recurse: a word end can still have longer words below it
                self.result += self.children[key].get_suffixes(suffix)
                suffix = suffix[:-1]
        return self.result

# Create a mock trie for the test
node = TrieNode()
node.insert("A")
node.children["A"].word_end = True
node.children["A"].insert("t")
node.children["A"].children["t"].word_end = True
node.children["A"].insert("b")
node.children["A"].children["b"].insert("a")
node.children["A"].children["b"].children["a"].insert("c")
node.children["A"].children["b"].children["a"].children["c"].insert("a")
node.children["A"].children["b"].children["a"].children["c"].children["a"].word_end = True
node.children["A"].insert("d")
node.children["A"].children["d"].insert("d")
node.children["A"].children["d"].children["d"].word_end = True
node.children["A"].children["d"].insert("m")
node.children["A"].children["d"].children["m"].insert("i")
node.children["A"].children["d"].children["m"].children["i"].insert("n")
node.children["A"].children["d"].children["m"].children["i"].children["n"].word_end = True
print(node.children["A"].get_suffixes())
Output:-
['t', 'baca', 'dd', 'dmin']
The thing to remember is that every child is a new instance of the TrieNode class and thus has its own separate result list.
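To make that concrete, here is a minimal sketch using a hypothetical Node class (a stripped-down stand-in for TrieNode, not code from the question). It only shows that a list created in __init__ is separate for every instance, so appending to a child's list leaves the parent's list untouched:

```python
class Node:
    """Hypothetical stand-in for TrieNode, for illustration only."""

    def __init__(self):
        self.children = {}
        self.result = []  # a fresh list is created for every instance


parent = Node()
parent.children["x"] = Node()
child = parent.children["x"]

# Appending through the child only changes the child's own list.
child.result.append("found in child")

print(parent.result)                  # []
print(child.result)                   # ['found in child']
print(parent.result is child.result)  # False
```

This is the situation in Version I: the recursive calls fill the lists of the descendants, and the list that gets returned is only the one belonging to the node the call started on.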
Modified Insertion + No Result Array:-
class TrieNode:
    def __init__(self):
        ## Initialize this node in the Trie
        self.word_end = False
        self.children = dict()

    def insert(self, string):
        if len(string) == 0:
            self.word_end = True
            return
        ## Add a child node in this Trie
        if not string[0] in self.children:
            self.children[string[0]] = TrieNode()
        self.children[string[0]].insert(string[1:])

    def get_suffixes(self, suffix=""):
        query_result=[]
        if self.word_end:
            query_result.append(suffix)
        for i in self.children:
            query_result+=self.children[i].get_suffixes(suffix+i)
        return query_result

# Create a mock trie for the test
node = TrieNode()
node.insert("Add")
node.insert("At")
node.insert("Abaca")
node.insert("Admin")
print(node.children["A"].get_suffixes())
print(node.children["A"].get_suffixes())
print(node.children["A"].children["t"].get_suffixes())
Output:-
['dd', 'dmin', 't', 'baca']
['dd', 'dmin', 't', 'baca']
['']
[Finished in 0.0s]
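If you would rather keep the shape of Version II, one possible sketch (an alternative I am assuming fits your constraints, not something from the question) is to use None as the default and create the list inside the call, then pass that same list down explicitly. The word-level insert is borrowed from the modified version above. Like your original function, this variant only reports suffixes strictly below the starting node, so it never returns the empty string.

```python
class TrieNode:
    def __init__(self):
        self.word_end = False
        self.children = {}

    def insert(self, string):
        # Insert a whole word, one character per level.
        if not string:
            self.word_end = True
            return
        if string[0] not in self.children:
            self.children[string[0]] = TrieNode()
        self.children[string[0]].insert(string[1:])

    def suffixes(self, suffix="", result=None):
        # None is the sentinel: a new list is made for each top-level call,
        # so nothing leaks from one call to the next.
        if result is None:
            result = []
        for char, child in self.children.items():
            if child.word_end:
                result.append(suffix + char)
            # Pass the same list down so descendants add to it.
            child.suffixes(suffix + char, result)
        return result


root = TrieNode()
for word in ["fun", "function", "factory"]:
    root.insert(word)

f_node = root.children["f"]
first = f_node.suffixes()
second = f_node.suffixes()
print(first)   # ['un', 'unction', 'actory']
print(second)  # ['un', 'unction', 'actory']
```

The second call returns the same list contents as the first, because the accumulator is rebuilt on every top-level call instead of living in the function's default argument.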
Upvotes: 2