Stefan
Stefan

Reputation: 1059

What happens when I add duplicate element into HashSet? Is old element overwritten or not?

Say I want to add duplicate element in HashSet. We know that is not allowed. So I came across 2, let's say theories.

Under 'PROPERTIES, number 2' of this link, it says:

HashSet does not allow duplicate elements. If you try to insert a duplicate element, older element will be overwritten

But when I check docs that are provided to me in IDE, in method add(), it states:

* If this set already contains the element, the call leaves the set
* unchanged and returns {@code false}.

So does it overwrite(replace) old element or it leaves the set unchanged and returns false? :) Am I crazy of they say exactly the opposite?

Upvotes: 2

Views: 964

Answers (4)

Zerin
Zerin

Reputation: 11

The IDE says "unchanged", which includes two cases.

  1. Overwrite the old element, so keep "unchanged".
  2. Do not modify, so keep "unchanged".

To determine which case is correct, we need to look for the answer in the source code of the add method in HashSet.

First, the add method calls the put method in HashMap:

public boolean add(E e) {
    return map.put(e, PRESENT)==null;
}

Then, the put method calls the putVal method:

final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
               boolean evict) {
    Node<K,V>[] tab; Node<K,V> p; int n, i;
    if ((tab = table) == null || (n = tab.length) == 0)
        n = (tab = resize()).length;
    if ((p = tab[i = (n - 1) & hash]) == null)
        tab[i] = newNode(hash, key, value, null);
    else {
        Node<K,V> e; K k;
        if (p.hash == hash &&
            ((k = p.key) == key || (key != null && key.equals(k))))
            e = p;
        else if (p instanceof TreeNode)
            e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
        else {
            for (int binCount = 0; ; ++binCount) {
                if ((e = p.next) == null) {
                    p.next = newNode(hash, key, value, null);
                    if (binCount >= TREEIFY_THRESHOLD - 1)
                        treeifyBin(tab, hash);
                    break;
                }
                if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k))))
                    break;
                p = e;
            }
        }
        if (e != null) { 
            V oldValue = e.value;
            if (!onlyIfAbsent || oldValue == null)
                e.value = value;
            return oldValue;
        }
    }
    ++modCount;
    if (++size > threshold)
        resize();
    return null;
}

The key part is:

if (e != null) { 
    V oldValue = e.value;
    if (!onlyIfAbsent || oldValue == null)
        e.value = value;
    return oldValue;
}

Firstly, the code will check whether the found matching node e is null. If it is not null, it means that there is already a node in the hash table whose key is the same as the key of the newly inserted node, and the value of this node needs to be updated.

Then, the code stores the old value of the same node in the oldValue variable.

Next, the code checks whether it needs to update the value of the node. If onlyIfAbsent is false or the old value is null, it means that the value of the node needs to be updated. Since the put method passes the onlyIfAbsent parameter as false, it is here that a new value overwrites the old value, which proves that the first case is correct (overwrite the old element, so keep "unchanged").

Upvotes: 1

Stephen C
Stephen C

Reputation: 719307

The IDE is displaying an excerpt from the official specification (the javadocs) for the Set.add method.

The IDE is correct. Trust the official documentation, not some "pretty" website.


But how could people that made that site make such a big mistake then?

Always remember that the primary motivation for websites like the one you found is to earn money from your page views. They often pay more attention to "search engine optimization" than they do the quality of their material.


But the problem is those "random" sites sometimes make 'prettier' explanation of some concepts. Official docs are sometimes too hard to follow for me as a beginner.

So which is preferable? Something that easy to read, but wrong? Or something that is accurate?

In this case, you cannot argue that the official document is hard to understand. You yourself were able to see that the contradiction between the text taken from the official documentation and the 3rd-party website.

My advice us to always try to read the Official javadocs first, and always believe it ahead of any other sources ... including StackOverflow answers!! The only thing that is more definitive1 is the OpenJDK source code.


1 - And even that is debatable. On the one hand, the code determines what actually happens. On the other hand, the code may change from one version to another. So relying on behavior that is not specified in the javadocs may result in portability issues.

Upvotes: 4

Andres
Andres

Reputation: 10727

Execute the program below and you'll see the duplicate elements are not allowed:

import java.util.HashSet;

public class Test {
    public static class Dummy{
         String s1;
         String s2;
        public Dummy(String s1, String s2) {
            super();
            this.s1 = s1;
            this.s2 = s2;
        }
        @Override
        public int hashCode() {
            final int prime = 31;
            int result = 1;
            result = prime * result + ((s1 == null) ? 0 : s1.hashCode());
            return result;
        }
        @Override
        public boolean equals(Object obj) {
            if (this == obj)
                return true;
            if (obj == null)
                return false;
            if (getClass() != obj.getClass())
                return false;
            Dummy other = (Dummy) obj;
            if (s1 == null) {
                if (other.s1 != null)
                    return false;
            } else if (!s1.equals(other.s1))
                return false;
            return true;
        }
        @Override
        public String toString() {
            return "Dummy [s1=" + s1 + ", s2=" + s2 + "]";
        }
    }
    
    
    public static void main(String[] args) {
        HashSet<Object>hashSet =new HashSet<>();
        Object o1 = new Dummy("a","b");
        Object o2 = new Dummy("a","c");
        System.out.println(o1.equals(o2));
        hashSet.add(o1);
        System.out.println(hashSet);
        hashSet.add(o2);
        System.out.println(hashSet);
    }

}

Upvotes: 1

Nowhere Man
Nowhere Man

Reputation: 19565

Any the theories can be verified by looking at the source code.

For example, for JDK 8, HashSet::add implemented as follows:

/**
     * Adds the specified element to this set if it is not already present.
     * More formally, adds the specified element <tt>e</tt> to this set if
     * this set contains no element <tt>e2</tt> such that
     * <tt>(e==null&nbsp;?&nbsp;e2==null&nbsp;:&nbsp;e.equals(e2))</tt>.
     * If this set already contains the element, the call leaves the set
     * unchanged and returns <tt>false</tt>.
     *
     * @param e element to be added to this set
     * @return <tt>true</tt> if this set did not already contain the specified
     * element
     */
    public boolean add(E e) {
        return map.put(e, PRESENT)==null;
    }

Here map is an instance of HashMap, so we should look at HashMap::put:

 /**
     * Associates the specified value with the specified key in this map.
     * If the map previously contained a mapping for the key, the old
     * value is replaced.
     *
     * @param key key with which the specified value is to be associated
     * @param value value to be associated with the specified key
     * @return the previous value associated with <tt>key</tt>, or
     *         <tt>null</tt> if there was no mapping for <tt>key</tt>.
     *         (A <tt>null</tt> return can also indicate that the map
     *         previously associated <tt>null</tt> with <tt>key</tt>.)
     */
    public V put(K key, V value) {
        return putVal(hash(key), key, value, false, true);
    }

and implementation of putVal contains this code:

            if (e != null) { // existing mapping for key
                V oldValue = e.value;
                if (!onlyIfAbsent || oldValue == null)
                    e.value = value;
                afterNodeAccess(e);
                return oldValue;
            }

Upvotes: 1

Related Questions