javajava.util.concurrentconcurrenthashmap

Reputation: 10405

Consequences of updating other key(s) in ConcurrentHashMap#computeIfAbsent

Javadoc from ConcurrentHashMap#computeIfAbsent says

The computation should be short and simple, and must not attempt to update any other mappings of this map.

But, from what I see, using remove() and clear() methods inside mappingFunction works fine. For example this

Key element = elements.computeIfAbsent(key, e -> {
    if (usages.size() == maxSize) {
        elements.remove(oldest);
    }
    return loader.load(key);
});

What bad consequences of using remove() method inside mappingFunction could be?

Upvotes: 10

Answers (3)

davidxxx

Reputation: 131456

The javadoc explains clearly the cause :

Some attempted update operations on this map by other threads may be blocked while computation is in progress, so the computation should be short and simple, and must not attempt to update any other mappings of this map.

You have not to forget that ConcurrentHashMap is designed to provide a way of using a thread safe Map without being as locking as it is the case for older thread safe Map classes as HashTable.
When a modification on the map occurs, it locks only the concerned mapping and not the whole map.

ConcurrentHashMap is a hash table supporting full concurrency of retrievals and high expected concurrency for updates.

computeIfAbsent() is a new method added in Java 8.
If badly used, that is, if in the body of computeIfAbsent() that already locks the mapping of the key passed to the method, you lock another key, you enter in a path where you may defeat the purpose of the ConcurrentHashMap as finally you will lock two mapping.
Imagine the problem if you lock more mapping inside computeIfAbsent() and that the method is not short at all. Concurrency access on the map would become slow.

So the javadoc of computeIfAbsent() stresses on this potential issue by reminding the principles of ConcurrentHashMap : keep it simple and fast.

Here is example code illustrating the issue.
Suppose we have an instance of ConcurrentHashMap<Integer, String>.

We will start two threads that use it :

The first thread : thread1 that invokes computeIfAbsent() with the key 1
The second thread : thread2 that invokes computeIfAbsent() with the key 2

thread1 executes a fast enough task but it doesn't follow advise of the computeIfAbsent() javadoc : it updates the key 2 in computeIfAbsent(), that is another mapping of which one used in the current context of the method (that is key 1).

thread2 executes a long enough task. It invokes computeIfAbsent() with the key 2 by following advises of the javadoc : it doesn't update any other mapping in the implementation of it.
To simulate the long task, we can use the Thread.sleep() method with 5000 as parameter.

For this specific situation, if thread2 starts before thread1, the invocation ofmap.put(2, someValue); in thread1 will be blocked while thread2 is not returned of computeIfAbsent() that locks the mapping of the key 2.

Finally, we get a ConcurrentHashMap instance that blocks the mapping of the key 2 during 5 seconds while computeIfAbsent() is invoked with the mapping of the key 1.
It is misleading, not effective and it goes against the ConcurrentHashMap intention and the computeIfAbsent() description which the intention is computing the value for the current key :

if the specified key is not already associated with a value, attempts to compute its value using the given mapping function and enters it into this map unless null

The sample code :

import java.util.concurrent.ConcurrentHashMap;

public class BlockingCallOfComputeIfAbsentWithConcurrentHashMap {

  public static void main(String[] args) throws InterruptedException {
    ConcurrentHashMap<Integer, String> map = new ConcurrentHashMap<>();

    Thread thread1 = new Thread() {
        @Override
        public void run() {
            map.computeIfAbsent(1, e -> {
                String valueForKey2 = map.get(2);
                System.out.println("thread1 : get() returns with value for key 2 = " + valueForKey2);
                String oldValueForKey2 = map.put(2, "newValue");
                System.out.println("thread1 : after put() returns, previous value for key 2 = " + oldValueForKey2);
                return map.get(2);
            });
        }
    };

    Thread thread2 = new Thread() {
        @Override
        public void run() {
          map.computeIfAbsent(2, e -> {
            try {
              Thread.sleep(5000);
            } catch (Exception e1) {
              e1.printStackTrace();
            }
            String value = "valueSetByThread2";
            System.out.println("thread2 : computeIfAbsent() returns with value for key 2 = " + value);
            return value;
          });
        }
    };

    thread2.start();
    Thread.sleep(1000);
    thread1.start();
  }
}

As output we always get :

thread1 : get() returns with value for key 2 = null

thread2 : computeIfAbsent() returns with value for key 2 = valueSetByThread2

thread1 : after put() returns, previous value for key 2 = valueSetByThread2

This is written fast as reads on the ConcurrentHashMap are not blocking :

thread1 : get() returns with value for key 2 = null

but this :

thread1 : after put() returns, previous value for key 2 = valueSetByThread2

is output only when thread2 is returned of computeIfAbsent().

Upvotes: 4

Andy Turner

Reputation: 140484

Such advice is a bit like the advice not to walk down the middle of a road. You can do it, and you may well not be hit by a car; you may also be able to get out of the way if you see the car coming.

However, you'd have been safer if you'd just stayed on the pavement (sidewalk) in the first place.

If an API document tells you not to do something, there is of course nothing stopping you from doing it. And you might try doing it, and find that there are no ill consequences, at least in the limited circumstances that you test. You can even dig in to find out the exact reasons why the advice is there; you can scrutinize the source code and prove that it is safe in your use case.

However, the implementors of the API are free to change the implementation, within the constraints of the contract described by the API documentation. They might make a change that stops your code working tomorrow, because they are under no obligation to preserve behavior that they explicitly warn against using.

So, to answer your question of what the bad consequences could be: literally anything (well, anything that completes normally or throws a RuntimeException); and you would not necessarily observe the same consequences over time or on different JVMs.

Stay on the pavement: don't do what the documentation tells you not to.

Upvotes: 2

Eran

Reputation: 393936

Here's an example of a bad consequence:

ConcurrentHashMap<Integer,String> cmap = new ConcurrentHashMap<> ();
cmap.computeIfAbsent (1, e-> {cmap.remove (1); return "x";});

This code causes a deadlock.

Upvotes: 5

Consequences of updating other key(s) in ConcurrentHashMap#computeIfAbsent

Answers (3)

Related Questions