Kobe-Wan Kenobi

Reputation: 3864

HashMap speed greater for smaller maps

This may be a strange question, but it is based on some results I am getting with Java's Map: is element retrieval faster in a HashMap when the map is smaller?

I have a part of my code that uses the containsKey and get(key) methods of a HashMap, and it seems to run faster if the number of elements in the map is smaller. Is that so?

My understanding is that HashMap uses a hash function to map a key to a certain bucket of the map, and there are implementations in which that bucket holds a reference to a linked list (because different keys can hash to the same bucket), or points to other buckets when the table is implemented fully statically.

Is it correct that retrieval can be faster if the Map has fewer elements?

I need to extend my question with a concrete example.

I have two cases; in both, the total number of elements is the same.

In both cases, I have a for loop that goes through each of the HashMaps, tries to find the same elements, and gets them if present.

Can it be that the total execution time is smaller because each individual search inside a HashMap is faster, and therefore so is their sum?

I know that this is very strange, but is something like this somehow possible, or am I doing something wrong?

A Map<Integer, Double> is considered. It is hard to tell what the distribution of elements is, since this is actually an implementation of the KMeans clustering algorithm and the elements are representations of cluster centroids. That means they will mostly depend on the initialization of the algorithm. Also, the total number of elements will not usually be the same, but I tried to simplify the problem; sorry if that was misleading.
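A minimal sketch of the two cases described in the question (the class name, element counts, and split into ten maps are made up for illustration; a single System.nanoTime measurement is only a rough indication, not a proper benchmark):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MapLookupTiming {
    public static void main(String[] args) {
        int total = 100_000;
        int parts = 10;

        // Case 1: one large map holding all elements.
        Map<Integer, Double> large = new HashMap<>();
        for (int i = 0; i < total; i++) {
            large.put(i, (double) i);
        }

        // Case 2: the same elements split across several smaller maps.
        List<Map<Integer, Double>> small = new ArrayList<>();
        for (int p = 0; p < parts; p++) {
            small.add(new HashMap<>());
        }
        for (int i = 0; i < total; i++) {
            small.get(i % parts).put(i, (double) i);
        }

        // Look up every key in the large map.
        long t0 = System.nanoTime();
        double sumLarge = 0;
        for (int i = 0; i < total; i++) {
            if (large.containsKey(i)) {
                sumLarge += large.get(i);
            }
        }
        long nsLarge = System.nanoTime() - t0;

        // Look up every key in the matching small map.
        t0 = System.nanoTime();
        double sumSmall = 0;
        for (int i = 0; i < total; i++) {
            Map<Integer, Double> m = small.get(i % parts);
            if (m.containsKey(i)) {
                sumSmall += m.get(i);
            }
        }
        long nsSmall = System.nanoTime() - t0;

        // Both runs visit the same total number of elements.
        System.out.println("large map: " + nsLarge + " ns, small maps: " + nsSmall + " ns");
        System.out.println(sumLarge == sumSmall);
    }
}
```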

Upvotes: 2

Views: 367

Answers (2)

Zim-Zam O'Pootertoot

Reputation: 18148

Does your timing take into account only the cost of get / containsKey, or are you also performing puts in the timed code section? If so, and if you're using the default constructor (initial capacity 16, load factor 0.75), then the larger hash tables are going to need to resize themselves more often than the smaller hash tables will. As Joop Eggen says in his answer, try playing around with the initial capacity in the constructor; e.g. if you know that you have N elements, then set the initial capacity to N / number_of_hash_tables or something along those lines. This ought to result in both the smaller and larger hash tables having sufficient capacity that they won't need to be resized.
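A sketch of that presizing idea (the helper method and the element counts are hypothetical; with the default load factor of 0.75, a map needs an initial capacity of at least expectedSize / 0.75 to avoid resizing during population):

```java
import java.util.HashMap;
import java.util.Map;

public class PresizedMaps {
    // Hypothetical helper: size the map so that inserting expectedSize
    // entries never crosses the resize threshold at load factor 0.75.
    static <K, V> Map<K, V> withCapacityFor(int expectedSize) {
        int initialCapacity = (int) Math.ceil(expectedSize / 0.75);
        return new HashMap<>(initialCapacity);
    }

    public static void main(String[] args) {
        int totalElements = 10_000;
        int numberOfMaps = 4;

        // Each smaller map gets capacity for its share of the data,
        // so populating it triggers no rehashing.
        Map<Integer, Double> part = withCapacityFor(totalElements / numberOfMaps);
        for (int i = 0; i < totalElements / numberOfMaps; i++) {
            part.put(i, (double) i);
        }
        System.out.println(part.size());
    }
}
```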

Upvotes: 0

Joop Eggen

Reputation: 109597

The number of collisions is decisive for a slowdown.

Assume an array of some size; the hash code modulo that size then gives the index where the object is put. Two objects with the same index collide.
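That index calculation can be sketched as follows (simplified: the real java.util.HashMap spreads the hash bits and computes (size - 1) & hash, since its table size is always a power of two; the key value here is made up):

```java
public class IndexSketch {
    public static void main(String[] args) {
        int size = 16;              // backing array size
        String key = "cluster-7";   // arbitrary example key

        // Hash code modulo the array size picks the bucket index.
        // floorMod keeps the result non-negative even for negative hashes.
        int index = Math.floorMod(key.hashCode(), size);

        System.out.println(index >= 0 && index < size);
    }
}
```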

Having a large capacity (array size) with respect to number of elements helps.

With HashMap there are overloaded constructors with extra settings.

public HashMap(int initialCapacity,
               float loadFactor)

Constructs an empty HashMap with the specified initial capacity and load factor.

You might experiment with that.

For a specific key class used with a HashMap, having a good hashCode can help too. Hash codes are a separate mathematical field.
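A sketch of what a reasonable hashCode for a custom key class can look like (the CentroidKey class is invented for illustration; a constant hashCode, by contrast, would degrade every lookup into a walk along one long collision chain):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical key class for map entries keyed by cluster and dimension.
final class CentroidKey {
    final int clusterId;
    final int dimension;

    CentroidKey(int clusterId, int dimension) {
        this.clusterId = clusterId;
        this.dimension = dimension;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof CentroidKey)) return false;
        CentroidKey k = (CentroidKey) o;
        return clusterId == k.clusterId && dimension == k.dimension;
    }

    @Override
    public int hashCode() {
        // Objects.hash mixes both fields, spreading keys across buckets.
        return Objects.hash(clusterId, dimension);
    }
}

public class HashCodeDemo {
    public static void main(String[] args) {
        Map<CentroidKey, Double> map = new HashMap<>();
        map.put(new CentroidKey(1, 0), 3.5);
        // Equal keys hash equally, so the value is found again.
        System.out.println(map.get(new CentroidKey(1, 0)));
    }
}
```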

Of course, using less memory helps at the processor cache / physical memory level, but I doubt it has an influence in this case.

Upvotes: 1
