Josh l
Josh l

Reputation: 47

Why is a hash map get/set considered to have O(1) complexity?

Assume we have the following hash map class in Javascript:

class myHash {
    constructor() {
        this.list = [];
    }
    hash(input) {
        var checksum = 0;
        for (var i = 0; i < input.length; i++) {
            checksum += input.charCodeAt(i);
        }
        return checksum;
    }
    get(input) {
        return (this.list[this.hash(input)]);
    }
    set(input, value) {
        this.list[this.hash(input)] = value;
    }
}

The hash function has a loop which has a complexity of O(n) and is called during getters and setters. Doesn't this make the complexity of the hash map O(n)?

Upvotes: 3

Views: 231

Answers (2)

apple apple
apple apple

Reputation: 10591

Yes, the string size (k) does matter. (more precisely, the hash function complexity)


Assume:

  • Get item use array index takes f(n) time
  • The hash function takes g(k) time

then the complexity is O( f(n)+g(k) ).

We know that g(k) is O(k), and if we assume f(n) is O(1), the complexity becomes O(k)

Furthermore, if we assume the string size k would not great than a constant c, the complexity becomes O(c) which can be rewrite as O(1).


So in fact, given your implementation, the O(1) is correct bound only if

  • Get item use array index takes O(1)
  • The string would not longer than a constant c

Notes

  • Some hash function may itself be O(1), like simply take the first character or the length.
  • Whether get item use array index takes O(1) should be check, for example in javascript sparse array may take longer to access.

Upvotes: 0

John Kugelman
John Kugelman

Reputation: 361729

When you're performing Big-O analysis you need to be very clear what the variables are. Oftentimes the n is left undefined, or implied, but it's crucial to know what exactly it is.

  • Let's define n as the number of items in the hash map.

When n is the only variable under consideration then all of the methods are O(1). None of them loops over this.list, and so all operate in constant time with respect to the number of items in the hash map.

But, you object: there's a loop in hash(). How can it be O(1). Well, what is it looping over? Is it looping over the other items in the map? No. It's looping over input—but input.length is not a variable we're considering.

When people analyze hash map performance they normally ignore the length of the strings being passed in. If we do that, then with respect to n hash map performance is O(1).


If you do care about string lengths then you need to add another variable to the analysis.

  • Let's define n as the number of items in the hash map.
  • Let's define k as the length of the string being read/written.

The hash function is O(k) since it loops over the input string in linear time. Therefore, get() and set() are also O(k).

Why don't we care about k normally? Why do people only talk about n? It's because k is a factor when analyzing the hash function's performance, but when we're analyzing how well the hash map performs we don't really care about how quickly the hash function runs. We want to know how well the hash map itself is doing, and none of its code is directly impacted by k. Only hash() is, and hash() is not a part of the hash map, it's merely an input to it.

Upvotes: 4

Related Questions