Tan Yu Hau Sean
Tan Yu Hau Sean

Reputation: 625

How can i create a find method and a insert method for O(1) time complexity in a custom HashMap implementation?

I am making a custom implementation of a HashMap without using the HashMap data structure as part of an assignment, currently I have the choice of working with two 1D arrays or using a 2D array to store my keys and values. I want to be able to check if a key exists and return the corresponding value in O(1) time complexity (assignment requirement) but i am assuming it is without the use of containsKey().

Also, when inserting key and value pairs to my arrays, i am confused because it should not be O(1) logically, since there would occasionally be cases where there is collision and i have to recalculate the index, so why is the assignment requirement for insertion O(1)?

Upvotes: 0

Views: 179

Answers (2)

Joni
Joni

Reputation: 111349

You're correct in that hash table insert is not guaranteed to be O(1) due to collisions. If you use the open addressing strategy to deal with collisions, the process to insert an item is going to take time proportional to 1/(1-a) where a is the proportion of how much of the table capacity has been used. As the table fills up, a goes to 1, and the time to insert grows without bound.

The secret to keeping time complexity as O(1) is making sure that there's always room in the table. That way a never grows too big. That's why you have to resize the table when it starts to run out capacity.

Problem: resizing a table with N item takes O(N) time.

Solution: increase the capacity exponentially, for example double it every time you need to resize. This way the table has to be resized very rarely. The cost of the occasional resize operations is "amortized" over a large number of insertions, and that's why people say that hash table insertion has "amortized O(1)" time complexity.


TLDR: Make sure you increase the table capacity when it's getting full, maybe 70-80% utilization. When you increase the capacity, make sure that it's by a constant factor, for example doubling it.

Upvotes: 1

Sean Patrick Floyd
Sean Patrick Floyd

Reputation: 299068

A lot of questions in there, let me give it a try.

I want to be able to check if a key exists and return the corresponding value in O(1) time complexity (assignment requirement) but i am assuming it is without the use of containsKey().

That actually doesn't make a difference. O(1) means the execution time is independent of the input, it does not mean a single operation is used. If your containsKey() and put() implementations are both O(1), then so is your solution that uses both of them exactly once.

Also, when inserting key and value pairs to my arrays, i am confused because it should not be O(1) logically, since there would occasionally be cases where there is collision and i have to recalculate the index, so why is the assignment requirement for insertion O(1)?

O(1) is the best case, which assumes that there are no hash collisions. The worst case is O(n) if every key generates the same hash code. So when a hash map's lookup or insertion performance is calculated as O(1), that assumes a perfect hashCode implementation.

Finally, when it comes to data structures, the usual approach is to use a single array, where the array items are link list nodes. The array offsets correspond to hashcode() % array size (there are much more advances formulas than this, but this is a good starting point). In case of a hash collision, you will have to navigate the linked list nodes until you find the correct entry.

Upvotes: 2

Related Questions