Reputation: 1473
I am looking at migrating from Dictionary to ConcurrentDictionary for a multithreaded environment.
Specific to my use case, a key-value pair would typically be <string, List<T>>.
Upvotes: 0
Views: 653
Reputation: 43515
The ConcurrentDictionary<TKey,TValue> collection is surprisingly difficult to master. The pitfalls that are waiting to trap the unwary are numerous and subtle. Here are some of them:

1. Assuming that the ConcurrentDictionary<TKey,TValue> blesses everything it contains with thread-safety. That's not true. If the TValue is a mutable class, and is allowed to be mutated by multiple threads, it can be corrupted just as easily as if it wasn't contained in the dictionary.
2. Using the ConcurrentDictionary<TKey,TValue> with patterns familiar from the Dictionary<TKey,TValue>. Race conditions can trivially emerge. For example, if (dict.ContainsKey(x)) list = dict[x] is wrong. In a multi-threaded environment it is entirely possible that the key x will be removed between the dict.ContainsKey(x) and the list = dict[x], resulting in a KeyNotFoundException. The ConcurrentDictionary<TKey,TValue> is equipped with special atomic APIs that should be used instead of the previous chatty check-then-act pattern.
3. Using Count == 0 for checking if the dictionary is empty. The Count property is very cheap for a Dictionary<TKey,TValue>, and very expensive for a ConcurrentDictionary<TKey,TValue>. The correct property to use is IsEmpty.
4. Assuming that the AddOrUpdate method can be safely used for updating a mutable TValue object. This is not a correct assumption. The "Update" in the name of the method means "update the dictionary, by replacing an existing value with a new value". It doesn't mean "modify an existing value".
5. Assuming that enumerating a ConcurrentDictionary<TKey,TValue> will yield the entries that were stored in the dictionary at the point in time that the enumeration started. That's not true. The enumerator does not maintain a snapshot of the dictionary. The behavior of the enumerator is not documented precisely. It's not even guaranteed that a single enumeration of a ConcurrentDictionary<TKey,TValue> will yield unique keys. In case you want to do an enumeration with snapshot semantics, you must first take a snapshot explicitly with the (expensive) ToArray method, and then enumerate the snapshot. You might even consider switching to an ImmutableDictionary<TKey,TValue>, which is exceptionally good at providing these semantics.
6. Assuming that calling extension methods on the ConcurrentDictionary<TKey,TValue>'s interfaces is safe. This is not the case. For example, the ToArray method is safe because it's a native method of the class. The ToList is not safe because it is a LINQ extension method on the IEnumerable<KeyValuePair<TKey,TValue>> interface. This method internally first calls the Count property of the ICollection<KeyValuePair<TKey,TValue>> interface, and then the CopyTo of the same interface. In a multi-threaded environment the Count obtained by the first operation might not be compatible with the second operation, resulting in either an ArgumentException, or a list that contains empty elements at the end.

In conclusion, migrating from a Dictionary<TKey,TValue> to a ConcurrentDictionary<TKey,TValue> is not trivial. In many scenarios sticking with the Dictionary<TKey,TValue> and adding synchronization around it might be an easier (and safer) path to thread-safety. IMHO the ConcurrentDictionary<TKey,TValue> should be considered more as a performance optimization over a synchronized Dictionary<TKey,TValue>, than as the tool of choice when a dictionary is needed in a multi-threading scenario.
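As a rough illustration of the points above, here is a minimal sketch of the atomic APIs in use. It is not the only way to do this. It assumes the value type is swapped for an ImmutableList<int> (my substitution, not part of the question; on .NET Framework this needs the System.Collections.Immutable package), so that AddOrUpdate replaces the stored value instead of mutating it. The key "a" and the variable names are made up for the example.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Collections.Immutable;

// Assumption for this sketch: immutable values, so "update" means "replace".
var dict = new ConcurrentDictionary<string, ImmutableList<int>>();

// AddOrUpdate stores a NEW list each time; the old list is never modified.
dict.AddOrUpdate("a", ImmutableList.Create(1), (_, old) => old.Add(1));
dict.AddOrUpdate("a", ImmutableList.Create(2), (_, old) => old.Add(2));

// One atomic lookup instead of ContainsKey followed by the indexer.
bool found = dict.TryGetValue("a", out var list);

// IsEmpty instead of Count == 0.
bool isEmpty = dict.IsEmpty;

// Explicit snapshot via the class's own ToArray (not a LINQ extension),
// then enumerate the snapshot rather than the live dictionary.
KeyValuePair<string, ImmutableList<int>>[] snapshot = dict.ToArray();
foreach (var entry in snapshot)
{
    Console.WriteLine($"{entry.Key}: {string.Join(", ", entry.Value)}");
}
```

Note that the update delegate passed to AddOrUpdate may run more than once under contention, so it should stay free of side effects.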
Upvotes: 3
Reputation: 142038
What do I need to look out for?
Depends on what you are trying to achieve :)
How do I manage reading key and values in different threads?
How do I manage updating key and values in different threads?
How do I manage adding/removing key and values in different threads?
Those should be handled by the dictionary itself in a thread-safe manner, with several caveats:
Functions accepting factories, like ConcurrentDictionary<TKey,TValue>.GetOrAdd(TKey, Func<TKey,TValue>), are not thread-safe in terms of factory invocation (i.e. the dictionary does not guarantee that the factory is invoked only once if multiple threads try to get or add the same item, for example). Quote from the docs:
All these operations are atomic and are thread-safe with regards to all other operations on the ConcurrentDictionary<TKey,TValue> class. The only exceptions are the methods that accept a delegate, that is, AddOrUpdate and GetOrAdd. For modifications and write operations to the dictionary, ConcurrentDictionary<TKey,TValue> uses fine-grained locking to ensure thread safety. (Read operations on the dictionary are performed in a lock-free manner.) However, delegates for these methods are called outside the locks to avoid the problems that can arise from executing unknown code under a lock. Therefore, the code executed by these delegates is not subject to the atomicity of the operation.
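If the factory is expensive or has side effects, one commonly used workaround (not something the quoted docs prescribe) is to store Lazy<TValue> in the dictionary. The outer GetOrAdd factory may still run several times, but it only creates cheap Lazy wrappers; only the wrapper that ends up stored gets its Value evaluated. A minimal sketch, with the key "key" and the counter factoryCalls made up for the demonstration:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

var cache = new ConcurrentDictionary<string, Lazy<int>>();
int factoryCalls = 0; // hypothetical counter, only here to observe the behavior

Parallel.For(0, 100, i =>
{
    // GetOrAdd always returns the Lazy that is actually stored in the dictionary.
    Lazy<int> lazy = cache.GetOrAdd("key", k => new Lazy<int>(() =>
    {
        // Lazy<T> created with a Func defaults to ExecutionAndPublication,
        // so this body runs at most once per stored Lazy instance.
        Interlocked.Increment(ref factoryCalls);
        return k.Length;
    }));

    int value = lazy.Value; // triggers (or waits for) the single evaluation
});

Console.WriteLine($"factory calls: {factoryCalls}");
```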
In your particular case the value, List<T>, is not thread-safe itself, so while dictionary operations will be thread-safe (with the exception from the previous point), mutating operations on the value itself will not be. Consider using something like ConcurrentBag or switching to IReadOnlyDictionary.
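If you do keep List<T> as the value, one possible approach is to synchronize on the list instance for every read and write of that list. The sketch below assumes that a stored list is never removed or replaced once added, and that all code touching the list takes the same lock; the key "numbers" is made up for the example.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

var dict = new ConcurrentDictionary<string, List<int>>();

Parallel.For(0, 1000, i =>
{
    // The factory may run more than once, but every thread gets back
    // the single list instance that is actually stored.
    List<int> list = dict.GetOrAdd("numbers", _ => new List<int>());

    // The dictionary does not protect the list; guard each mutation.
    lock (list)
    {
        list.Add(i);
    }
});

int count;
lock (dict["numbers"]) // readers need the same lock
{
    count = dict["numbers"].Count;
}
Console.WriteLine(count);
```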
Personally I would be cautious working with a concurrent dictionary via explicitly implemented interfaces like IDictionary<TKey, TValue> and/or the indexer (it can lead to race conditions in read-update-write scenarios). From the docs:
Thread Safety
All public and protected members of ConcurrentDictionary<TKey,TValue> are thread-safe and may be used concurrently from multiple threads. However, members accessed through one of the interfaces the ConcurrentDictionary<TKey,TValue> implements, including extension methods, are not guaranteed to be thread safe and may need to be synchronized by the caller.
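To make the read-update-write concern with the indexer concrete, here is a small sketch with a made-up counter under the key "hits". Writing counters["hits"] = counters["hits"] + 1 from several threads is two separate operations and can lose increments, whereas AddOrUpdate retries until its replacement is applied against the value it read.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

var counters = new ConcurrentDictionary<string, int>();

Parallel.For(0, 10_000, i =>
{
    // Racy alternative (do not use from multiple threads):
    //   counters["hits"] = counters["hits"] + 1;

    // The update delegate may be invoked more than once under contention,
    // but the stored result reflects every call exactly once.
    counters.AddOrUpdate("hits", 1, (_, current) => current + 1);
});

int hits = counters["hits"];
Console.WriteLine(hits);
```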
Upvotes: 3