mark
mark

Reputation: 62836

Out of two ways to get/create items concurrently from/in ConcurrentHashMap, which one is better?

I have found myself using two slightly different approaches for getting/creating items from/in a ConcurrentHashMap and I wonder which one is better.

The first way:

public class Item {
  private boolean m_constructed;
  ...
  public void construct() {
    if (m_constructed) {
      return;
    }

    synchronized (this) {
      if (m_constructed) {
        return;
      }

      // Some heavy construction

      m_constructed = true;
    }
  }
}

ConcurrentHashMap<String, Item> m_items = new ConcurrentHashMap<String, Item>();

...
// The following code is executed concurrently by multiple threads:
public Item getOrCreateItem(String key) {
  Item newItem = new Item();                          // The constructor is empty
  Item item = m_items.putIfAbsent(key, newItem);
  if (item == null) {
    item = newItem;
  }
  item.construct();                                  // This is the real construction
  return item;
}

Please, do not comment on using this in synchronize (this). I am aware of the crappiness of using this as the lock object, but I am fine with it in this particular example.

The second way:

public class Item {
  private boolean m_constructed;
  ...
  public void construct() {
    // Some heavy construction

    m_constructed = true;
  }

  public void waitForConstruction() throws InterruptedException {
    while (!m_constructed) {
      Thread.sleep(50);
    }
  }
}

ConcurrentHashMap<String, Item> m_items = new ConcurrentHashMap<String, Item>();

...
// The following code is executed concurrently by multiple threads:
public Item getOrCreateItem(String key) {
  Item newItem = new Item();                          // The constructor is empty
  Item item = m_items.putIfAbsent(key, newItem);
  if (item == null) {
    item.construct();                                 // This is the real construction
    item = newItem;
  }
  item.waitForConstruction();
  return item;
}

I wonder if one way is more superior to the other. Any ideas?

EDIT

A few words on the context. The Item map is populated concurrently by multiple threads, all of which execute getOrCreateItem. No code tries to access the map in any other way. Once the population is over, the map is never modified and becomes open to read-only access. Hence no one can get a partially constructed Item instance outside the getOrCreateItem method.

EDIT2

Thanks for all the answers. I have adopted the first approach with the suggested fixes:

public class Item {
  private volatile boolean m_constructed;         // !!! using volatile
  ...
  public void construct() {
    if (m_constructed) {
      return;
    }

    synchronized (this) {
      if (m_constructed) {
        return;
      }

      // Some heavy construction

      m_constructed = true;
    }
  }
}

ConcurrentHashMap<String, Item> m_items = new ConcurrentHashMap<String, Item>();

...
// The following code is executed concurrently by multiple threads:
public Item getOrCreateItem(String key) {
  Item item = m_items.get(key);                         // !!! Checking the mainstream case first
  if (item == null) {
    Item newItem = new Item();                          // The constructor is empty
    item = m_items.putIfAbsent(key, newItem);
    if (item == null) {
      item = newItem;
    }
  }
  item.construct();                                  // This is the real construction
  return item;
}

Of course, I am acting under the assumption that no code accesses the map of items using any other way but the getOrCreateItem method. Which is true in my code.

Upvotes: 0

Views: 141

Answers (2)

sjlee
sjlee

Reputation: 7876

I think the first solution is not so bad. The m_constructed variable must be volatile for it to work correctly, and as John Vint suggested, calling ConcurrentMap.get() before doing putIfAbsent() is recommended.

I suspect that the most important call path is the steady state access (threads accessing the item that is already added to the map and is already constructed). In that case, with the first solution, you would do a ConcurrentHashMap.get() call and a volatile read (on m_constructed), which is not so bad.

The second solution is a poor one as it involves unnecessary busy spin loop. When it is converted to using a CountDownLatch per John Vint's suggestion, the steady state performance would be similar to the above: a ConcurrentHashMap.get() call and CountDownLatch.await() which should be similar to a uncontended volatile read. However, the only downside is that it adds more memory to Item.

Upvotes: 1

Eyal Schneider
Eyal Schneider

Reputation: 22446

First, note that none of your solutions is properly synchronized. The m_constructed flag must be made volatile for this to work. Otherwise you may encounter thread visibility problems, since your read access to this member isn't protected by the lock.

Anyway, the item class is too similar to the concept of Future. If you store Future implementation as map values (e.g. FutureTask), your problem is solved.

Upvotes: 3

Related Questions