Reputation: 11
Writing Huffman Coding Algorithm in Python. Have successfully managed to create a tree based on a string input but am stuck on the best way to traverse it while generating the codes for each letter.
from collections import Counter
class HuffNode:
def __init__(self, count, letter=None):
self.letter = letter
self.count = count
self.right = None
self.left = None
word = input()
d = dict(Counter(word))
Nodes = [HuffNode(d[w], w) for w in sorted(d, key=d.get, reverse=True)]
while len(Nodes) > 1:
a = Nodes.pop()
b = Nodes.pop()
c = HuffNode(a.count+b.count)
c.left, c.right = a, b
Nodes.append(c)
Nodes.sort(key=lambda x: x.count, reverse=True)
d = dict(Counter(word))
would get the frequency of each letter in the string and convert it to a dict. Thus having {'e': 1, 'l': 2, 'h': 1, 'o': 1}
Each letter if then converted to a HuffNode
and stored in Nodes
The while
loop then proceeds to generate a tree until we only have one root
When the loop exits I'll have: Whats the best way to traverse this tree then generating the codes for each letter? Thanks
Upvotes: 1
Views: 2642
Reputation: 766
Generally speaking, you would want a recursive function that, given a HuffNode
h and a prefix p:
h.letter
is not empty (i.e. h is a leaf), yields (p, h.letter)
-> this is the code for the letterh.left
with prefix p + '0'
and on h.right
with p + '1'
A possible implementation (not tested, may have typos):
def make_code(node, prefix):
if node is None:
return []
if node.letter is not None:
return [(prefix, node.letter)]
else:
result = []
result.extend(make_code(h.left, prefix + '0'))
result.extend(make_code(h.right, prefix + '1'))
return result
codes = make_code(root, '')
where root
is the Huffman tree you built in the first step. The first test (if node is None
) is to manage lopsided nodes, where one of the children may be empty.
Upvotes: 0