user20210310
user20210310

Reputation: 112

Hash Map implementation in python

So I am trying to implement dict from scratch considering collision ,too which i handled through the separate chaining. I have added some key value pairs from the .csv file to count most occurring word. But I am struggling to add up values since it stores the same key several times in the same index in spite of handling this(see the code hash_table[word] += 1.....). I have tried debugging still could not pinpoint the issue.

class HashMap:
    def __init__(self):
        self.max = 50
        self.array = [[] for item in range(self.max)]

    
    def hash(self, key):
        hash_values = 0
        if len(key) == 1:
            hash_values = ord(key)
        else:
            for char in key:
                hash_values += ord(char)
        
        return hash_values % 50

    
    def __setitem__(self, key, values):
        hash = self.hash(key)
        for idx, item in enumerate(self.array[hash]):
            if len(item) == 2 and item[0] == key:
                self.array[hash][idx][1] =+ 1
                break
        self.array[hash].append([key, values])

    
    def __getitem__(self, key):
        h = self.hash(key)
        for idx, item in enumerate(self.array[h]):
            if item[0] == key:
                return item[1]
              

    def __str__(self):
        return f"{self.array}"

hash_table = HashMap()

word_dict = []
with open("poem.csv", "r") as csv_file:
    for line in csv_file:
        words = line.split(" ")
        for word in words:
            word = word.replace('\n','')
            if hash_table[word]:
                hash_table[word] += 1
                break
            hash_table[word] = 1

.csv:

Two roads diverged in a yellow wood,
And sorry I could not travel both
And be one traveler, long I stood
And looked down one as far as I could
To where it bent in the undergrowth;

and

Then took the other, as just as fair,
And having perhaps the better claim,
Because it was grassy and wanted wear;
Though as for that the passing there
Had worn them really about the same,

Output:

[[['And', 1], ['sorry', 1], ['And', 2], ['And', 2], ['bent', 1], ['', 1], ['', 2], ['And', 2]], [], [], [], [['travel', 1], ['both', 1], ['just', 1], ['worn', 1]], [['them', 1]], [], [['and', 1]], [], [], [['wood,', 1], ['could', 1]], [], [['roads', 1], ['not', 1], ['as', 1], ['as', 2], ['as', 2]]......]

As you noticed, HashMap stored identical keys in the same index several time.

Upvotes: 0

Views: 580

Answers (1)

Tim Roberts
Tim Roberts

Reputation: 54777

You have two problems. First, in __setitem__, you have =+ 1. What that says is "set the value of this item to +1". You want += 1. C has both, Python only has the one.

Second, in __setitem__, if you FIND the item, you modify the entry, then you break out of the loop, which then appends ANOTHER item to the list. You should return, not break.

With those two fixes, your code works for me.

Upvotes: 3

Related Questions