pookie
pookie

Reputation: 4142

Update entries of a dictionary in parallel?

No idea if this is possible, but rather than iterate over a dictionary and modify entries based on some condition, sequentially, I was wondering if it is possible to do this in parallel?

For example, rather than:

Dictionary<int, byte> dict = new Dictionary<int, byte>();
for (int i = 0; i < dict.Count; i++)
{
    dict[i] = 255;
}

I'd like something like:

Dictionary<int, byte> dict = new Dictionary<int, byte>();
dict.Parallel(x => x, <condition>, <function_to_apply>);

I realise that in order to build the indices for modifying the dict, we would need to iterate and build a list of ints... but I was wondering if there was some sneaky way to do this that would be both faster and more concise than the first example.

I could of course iterate through the dict and for each entry, spawn a new thread and run some code, return the value and build a new, updated dictionary, but that seems really overkill.

The reason I'm curious is that the <function_to_apply> might be expensive.

Upvotes: 1

Views: 1833

Answers (2)

Sasha
Sasha

Reputation: 8860

First of all, I mostly agree with others recommending ConcurrentDictionary<> - it is designed to be thread-safe.

But if you are adventurous coder ;) and performance it super-critical for you, you could sometimes try doing what you (I suppose) is trying to do in case no new keys are added and no keys are removed from dictionary during your parallel manipulations:

int keysNumber = 1000000;
Dictionary<int, string> d = Enumerable.Range(1, keysNumber)
                                .ToDictionary(x => x, x => (string)null);

Parallel.For(1, keysNumber + 1, k => { d[k] = "Value" + k; /*Some complex logic might go here*/ });

To verify data consistency after these operations you can add:

Debug.Assert(d.Count == keysNumber);
for (int i = 1; i <= keysNumber; i++)
{
    Debug.Assert(d[i] == "Value" + i);
}
Console.WriteLine("Successful");

WHY IT WORKS:

Basically we have created dictionary in advance from SINGLE main thread and then popullated it in parallel. What allows us to do that is that current Dictionary implementation (Microsoft does not guarantee that, but most likely won't ever change) defines it's structure solely on keys, and values are just assigned to corresponding cells. Since each key is being assigned a new value from single thread we do not have race condition, and since navigating the hashtable concurrently does not alter it, everything works fine.

But you should be really careful with such code and have very good reasons not to use ConcurrentDictionary.

PS: My main idea is not even a "hack" of using Dicrionary concurrently, but to draw attention that not always every data structure need to be concurrent. I saw ConcurrentDictionary<int, ConcurrentStack<...>>, while each stack object in dictionary could be accessed only from single thread and that is an overkill and doesn't make your performance better. Just keep in mind what are you affecting and what can go wrong with multithreading scenarios.

Upvotes: 1

nvoigt
nvoigt

Reputation: 77364

I could of course iterate through the dict and for each entry, spawn a new thread and run some code, return the value and build a new, updated dictionary, but that seems really overkill.

Assuming you don't need the dictionary while it's rebuilt it's not that much:

var newDictionary = dictionary.AsParallel()
                              .Select(kvp => 
                                     /* do whatever here as long as
                                       it works with the local kvp variable
                                       and not the original dict */
                                     new
                                     {
                                          Key = kvp.Key,
                                          NewValue = function_to_apply(kvp.Key, kvp.Value)
                                     })
                                     .ToDictionary(x => x.Key,
                                                   x => x.NewValue);

Then lock whatever sync object you need and swap the new and old dictionaries.

Upvotes: 4

Related Questions