javamultithreadingconcurrencyconcurrenthashmap

Reputation: 9803

ConcurrentHashMap putIfAbsent : atomicity when followed by a get() call

I wanted to discuss a specific use I have of a concurrent map to sense check my logic...

If I used ConcurrentHashMap, I can do the familar

private final ConcurrentHashMap<K, V> map = new ConcurrentHashMap<K, V>();

public V getExampleOne(K key) {
    map.putIfAbsent(key, new Object());
    return map.get(key);
}

but I realise that a race condition exists whereby if I remove the item from the map between the putIfAbsent and the get, the method above would return something that no longer exists in the collection. This may or may not be fine, but lets assume that for my use case, it's not ok.

What I'd really like is to have the whole thing atomic. So,

public V getExampleTwo(K key) {
    return map.putIfAbsent(key, new Object());
}

but as this expands out to

if (!map.containsKey(key))
   return map.put(key, value);     [1]
return map.get(key);

which for line [1] will return null for first usage (ie, map.put will return the previous value, which for first time use is null).

I can't have it return null in this instance

Which leaves me with something like;

public V getExampleThree(K key) {
    Object object = new Object();
    V value = locks.putIfAbsent(key, object);
    if (value == null)
        return object;
    return value;
}

So, finally, my question; how do the examples above differ in semantics?. Does getExampleThree ensure atomicity like getExampleTwo but avoid the null return correctly? Are there other problems with getExampleThree?

I was hoping for a bit of discussion around the choices. I realise I could use a non ConcurrentHashMap and synchronize around clients calling my get method and a method to remove from the map but that seems to defeat the purpose (non blocking nature) of the ConcurrentHashMap. Is that my only choice to keep the data accurate?

I guess that's a bit part of why you'd choose ConcurrentHashMap; that its visible/up-to-date/acurrate at the point you interact with it, but there may be an impact further down the line if old data is going to be a problem...

Upvotes: 5

Answers (4)

OldCurmudgeon

Reputation: 65841

I think you will find the trick is to assume you will be non-atomic and handle it.

I am not really clear what you are looking for. Let me know if this is off at a tangent and I'll modify.

Perhaps you are looking for something like:

private final ConcurrentHashMap<String, Object> map = new ConcurrentHashMap();

/*
 * Guaranteed to return the object replaced.
 * 
 * Note that by the time this method returns another thread 
 * may have already replaced the object again.
 * 
 * All we can guarantee here is that the return value WAS 
 * associated with the key in the map at the time the entry was 
 * replaced.
 * 
 * A null return implies that either a null object was associated
 * with the specified key or there was no entry in the map for 
 * the specified key wih 'null' as it's 'value'.
 */
public Object consistentReplace ( String key, Object newValue ) {
  Object oldValue = map.get(key);
  while ( !map.replace(key, oldValue, newValue) ) {
    // Failed to replace because someone got in there before me.
    oldValue = map.get(key);
  }
  return oldValue;
}

Upvotes: 0

Tudor

Reputation: 62459

Why not simply use the first version and synchronize the method?

public synchronized V getExampleOne(K key) {
    map.putIfAbsent(key, new Object());
    return map.get(key);
}

While it won't provide you maximum parallelism, it's also true that you only have two operations and getExampleThree, while correct, is less readable and less easy to understand for someone else reading your code.

Upvotes: 0

Peter Lawrey

Reputation: 533660

It sounds like you are trying to create a global lock object for a key.

Instead of deleting an entry with the possibility of have it re-created almost immediately, I would only delete the entry when you pretty sure its not needed ever again.

Otherwise, if you are using this for a lock, you can have two thread locking on different objects for the same key.

If its not ok, you can busy loop it.

public V getExampleOne(K key) {
    for(Object o = null, ret = null; (ret = map.get(key)) == null; )
        map.putIfAbsent(key, o == null ? o = new Object() : o);
    return ret;
}

it can still be removed or replaced as soon as the loop exists so its effectively much the same as.

public V getExampleThree(K key) {
    Object o = new Object();
    map.putIfAbsent(key, o);
    Object ret = map.get(key);
    return ret == null ? o : ret;
}

So, finally, my question; how do the examples above differ in semantics?.

The difference is only the obvious.

Does getExampleThree ensure atomicity like getExampleTwo but avoid the null return correctly?

Yes.

Are there other problems with getExampleThree?

Only if you believe the very next call might not give you a different value (if you believe it can be removed in another thread)

Upvotes: 5

dmeister

Reputation: 35634

The methods have different semantics:

getExampleOne is not atomic.
getExampleTwo returns null if the new object was inserted into the map. This differs from the behavior of getExampleOne, but it is atomic.
getExampleThree is probably what you want. It is atomic and it return the object that is in the map after the point in time of the putIfAbsent call. But it was problem when nulls are valid values in your application. The null return value is then ambiguous.

However, depending of the situation it might not be the actual object at the point in time when you use the return value. You then need explicit locking.

Upvotes: 3

ConcurrentHashMap putIfAbsent : atomicity when followed by a get() call

Answers (4)

Related Questions