Reputation: 687
In the JavaDoc for the size()
method in ConcurrentHashMap
it states:
"Bear in mind that the results of aggregate status methods including size
, isEmpty
, and containsValue
are typically useful only when a map is not undergoing concurrent updates in other threads."
I know that in most concurrent applications maps are essentially moving targets therefore it is not that common to need the value returned by size()
method to be guaranteed to be correct. But in my case I actually do need it to be correct(not a stale value).
Seen as I really do not want to lock the whole table for each size()
call, my question is why does ConcurrentHashMap
not just store the size of the table in an AtomicInteger
field, which gets updated as part of a put()
or remove()
operation? If it did then the write to the size field would always be atomic and (due to the fact that AtomicInteger
stores its value in a volatile field) then any reads from the size field would be return the most recently updated version of the field...
Or am I missing something here?
Upvotes: 2
Views: 1488
Reputation: 719229
Fact #1: Updating a single size
field has the potential to be a concurrency bottleneck, whether it is implemented as a volatile int
, an AtomicInteger
or something else.
Fact #2: In the vast majority of use-cases for ConcurrentHashMap
, the size()
method is not useful to the application because it only tells you the size at a particular instant. (If that. The javadoc does not actually specify what size()
will return in the face of concurrent updates. )
If we assume that size()
is unlikely to be called, then introducing a potential concurrency bottleneck into update operations in case it is called seems like a bad idea to me. I'm guessing that the author (Doug Lea) had similar thoughts.
Upvotes: 3
Reputation: 298389
As ruediste explained, the size information is pointless, if you have ongoing concurrent updates. You may get the old value, the new value, or an arbitrary in-between value in case of multiple updates.
This has nothing to do with the question whether the value is stored in an atomic integer or not. But the whole point about ConcurrentHashMap
is to allow concurrent updates.
“Concurrent” means there is no time relationship nor order of actions. When a thread attempts to insert a value while another one attempts to remove a value, there is a time relationship only if they are accessing the same key. In that case the removing thread can use the result to tell whether an insertion happened before the removal or not. In all other cases, the threads act independently.
When querying the size at the same time you may get the old size, the old size plus one or the old size minus one (in case the initial state wasn’t empty). If you get a value different from the old size, you may use it to deduce whether the insertion or the removal happened first, and come to a wrong conclusion. A different thread using containsKey
at the same time to detect if either key is present might perceive a different order of these actions. That’s what “no ordering” or “no time relationship” implies.
Maintaining an atomic integer size value wouldn’t change anything. There is still a point where a thread has modified the map’s content but not the size field yet. And there might be an arbitrary number of threads right at this point. But forcing all threads to update a single atomic integer would reduce concurrency drastically. While an atomic integer is faster than a synchronized
block, it’s still a kind of synchronization action which suffers from high contention. So it would reduce performance without solving any problem.
Upvotes: 3
Reputation: 2959
Yes, you are missing something. Imagine 10 threads performing updates, and the eleventh regularly determining the size of the map.
The numbers the 11th thread reads totally depends on the execution timing of the 10 other threads. Using an atomic integer would not help here.
Upvotes: 4