DON
DON

Reputation: 43

Regarding HashMap implementation in java

I was trying to do research on hashmap and came up with the following analysis:

https://stackoverflow.com/questions/11596549/how-does-javas-hashmap-work-internally/18492835#18492835

Q1 Can you guys show me a simple map where you can show the process..that how hashcode for the given key is calculated in detail by using this formula ..Calculate position hash % (arrayLength-1)) where element should be placed(bucket number), let say I have this hashMap

HashMap map=new HashMap();//HashMap key random order.
         map.put("Amit","Java");
         map.put("Saral","J2EE");

Q2 Sometimes it might happen that hashCodes for 2 different objects are the same. In this case 2 objects will be saved in one bucket and will be presented as LinkedList. The entry point is more recently added object. This object refers to other objest with next field and so one. Last entry refers to null. Can you guys show me this with real example..!!

.

"Amit" will be distributed to the 10th bucket, because of the bit twiddeling. If there were no bit twiddeling it would go to the 7th bucket, because 2044535 & 15 = 7. how this is possible please explanin detail the whole calculation..?

Snapshots updated...

enter image description here

and the other image is ...

enter image description here

Upvotes: 5

Views: 5099

Answers (4)

It's Rammy
It's Rammy

Reputation: 1

import java.util.Arrays;
public class Test2 {
public static void main(String[] args) {
    Map<Integer, String> map = new Map<Integer, String>();
    map.put(1, "A");
    map.put(2, "B");
    map.put(3, "C");
    map.put(4, "D");
    map.put(5, "E");

    System.out.println("Iterate");
    for (int i = 0; i < map.size(); i++) {

        System.out.println(map.values()[i].getKey() + " : " + map.values()[i].getValue());
    }

    System.out.println("Get-> 3");
    System.out.println(map.get(3));

    System.out.println("Delete-> 3");
    map.delete(3);

    System.out.println("Iterate again");
    for (int i = 0; i < map.size(); i++) {

        System.out.println(map.values()[i].getKey() + " : " + map.values()[i].getValue());
    }
}

}

class Map<K, V> {

private int size;
private Entry<K, V>[] entries = new Entry[16];

public void put(K key, V value) {

    boolean flag = true;
    for (int i = 0; i < size; i++) {

        if (entries[i].getKey().equals(key)) {
            entries[i].setValue(value);
            flag = false;
            break;
        }
    }

    if (flag) {
        this.ensureCapacity();
        entries[size++] = new Entry<K, V>(key, value);
    }
}

public V get(K key) {

    V value = null;

    for (int i = 0; i < size; i++) {

        if (entries[i].getKey().equals(key)) {
            value = entries[i].getValue();
            break;
        }
    }
    return value;
}

public boolean delete(K key) {
    boolean flag = false;
    Entry<K, V>[] entry = new Entry[size];
    int j = 0;
    int total = size;
    for (int i = 0; i < total; i++) {

        if (!entries[i].getKey().equals(key)) {
            entry[j++] = entries[i];
        } else {
            flag = true;
            size--;
        }
    }

    entries = flag ? entry : entries;
    return flag;
}

public int size() {
    return size;
}

public Entry<K, V>[] values() {
    return entries;
}

private void ensureCapacity() {

    if (size == entries.length) {
        entries = Arrays.copyOf(entries, size * 2);
    }
}

@SuppressWarnings("hiding")
public class Entry<K, V> {

    private K key;
    private V value;

    public K getKey() {
        return key;
    }

    public V getValue() {
        return value;
    }

    public void setValue(V value) {
        this.value = value;
    }

    public Entry(K key, V value) {
        super();
        this.key = key;
        this.value = value;
    }

}
}

Upvotes: -1

Hot Licks
Hot Licks

Reputation: 47699

Understand that there are two basic requirements for a hash code:

  1. When the hash code is recalculated for a given object (that has not been changed internally in a way that would alter its identity) it must produce the same value as the previous calculation. Similarly, two "identical" objects must produce the same hash codes.
  2. When the hash code is calculated for two different objects (which are not considered "identical" from the standpoint of their internal content) there should be a high probability that the two hash codes would be different.

How these goals are accomplished is the subject of much interest to the math nerds who work on such things, but understanding the details is not at all important to understanding how hash tables work.

Upvotes: 0

Thomas Jungblut
Thomas Jungblut

Reputation: 20969

that how hashcode for the given key is calculated in detail by using this formula

In case of String this is calculated by String#hashCode(); which is implemented as follows:

 public int hashCode() {
    int h = hash;
        int len = count;
    if (h == 0 && len > 0) {
        int off = offset;
        char val[] = value;

            for (int i = 0; i < len; i++) {
                h = 31*h + val[off++];
            }
            hash = h;
        }
        return h;
    }

Basically following the equation in the java doc

 hashcode = s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]

One interesting thing to note on this implementation is that String actually caches its hash code. It can do this, because String is immutable.

If I calculate the hashcode of the String "Amit", it will yield to this integer:

System.out.println("Amit".hashCode());
>     2044535

Let's get through a simple put to a map, but first we have to determine how the map is built. The most interesting fact about a Java HashMap is that it always has 2^n buckets. So if you call it, the default number of buckets is 16, which is obviously 2^4.

Doing a put operation on this map, it will first get the hashcode of the key. There happens some fancy bit twiddeling on this hashcode to ensure that poor hash functions (especially those that do not differ in the lower bits) don't "overload" a single bucket.

The real function that is actually responsible for distributing your key to the buckets is the following:

 h & (length-1); // length is the current number of buckets, h the hashcode of the key

This only works for power of two bucket sizes, because it uses & to map the key to a bucket instead of a modulo.

"Amit" will be distributed to the 10th bucket, because of the bit twiddeling. If there were no bit twiddeling it would go to the 7th bucket, because 2044535 & 15 = 7.

Now that we have an index for it, we can find the bucket. If the bucket contains elements, we have to iterate over them and replace an equal entry if we find it. If none item has been found in the linked list we will just add it at the beginning of the linked list.

The next important thing in HashMap is the resizing, so if the actual size of the map is above over a threshold (determined by the current number of buckets and the loadfactor, in our case 16*0.75=12) it will resize the backing array. Resize is always 2 * the current number of buckets, which is guranteed to be a power of two to not break the function to find the buckets.

Since the number of buckets change, we have to rehash all the current entries in our table. This is quite costly, so if you know how many items there are, you should initialize the HashMap with that count so it does not have to resize the whole time.

Upvotes: 2

mishadoff
mishadoff

Reputation: 10789

Q1: look at hashCode() method implementation for String object

Q2: Create simple class and implement its hashCode() method as return 1. That means each your object with that class will have the same hashCode and therefore will be saved in the same bucket in HashMap.

Upvotes: 0

Related Questions