Achilles
Achilles

Reputation: 1129

Multiple threads updating different items of a shared Dictionary with locks on keys

I have a Dictionary<string, List<MyObject>> and I need to run some resource intensive operations on List<MyObject>. I'm trying to figure out if I can have one thread per key of the Dictionary doing the resource intensive tasks so that each thread updates its key's List. In other words, multiple threads simultaneously updating different items in the dictionary?

Please consider the following simplified pseudo code -

public void MyMethod() {
    //The myDict object needs to be shared by all threads.
    Dictionary<string, List<MyObject>> myDict = new Dictionary<string, List<MyObject>>();
    //GetKeyValue() may return the same key multiple times
    foreach(var kv in GetKeyValue()) { 
        if(myDict.ContainsKey(kv.Key) { myDict[kv.Key].Add(kv.Value); }
        else { myDict.Add(kv.Key, kv.Value); }
        Task.Factory.StartNew(() => { RunSubsetSum(kv.Key, myDict); });
    }
}
//Resource intensive method
public void RunSubsetSum(string key, Dictionary<string, List<MyObject>> myDict)  { 
    //Lock on key so that no two threads run for the same key
    lock(key){
        foreach(var valueToRemove in GetRemovableObjs()) 
            myDict[kv.Key].Remove(valueToRemove);
    }
}

Basically, the idea is that -

  1. No two threads run for the same key at the same time - Will lock(key) make them queue (run sequentially)?
  2. All running threads can independently update the same Dictionary - I expect multiple threads to be able to update different items in the same Dictionary simultaneously.

I tried the above approach but results seem to be inconsistent. I think it is because MyMethod() updates the Dictionary for a key for which RunSubsetSum() is already running but not sure how to lock on the key in MyMethod() without interrupting the loop for other keys. I wonder if C# provides simpler solution to this problem. Any thoughts?

Note: I'm considering creating a Dictionary so that I can keep track of which keys are currently being worked upon and updating MyMethod() to buffer the keys until threads finish, but I want to avoid adding this if I can to avoid overcomplicating the logic.

Upvotes: 3

Views: 2656

Answers (3)

Servy
Servy

Reputation: 203802

You shouldn't ever lock on a string. You're just opening yourself up for a world of hurt, mostly centering around string interning. Each time you use a string literal that is semantically identical to another string literal, they'll have the same reference (unless you turn it off), which means if any of your dictionary keys end up being string literals, then some other code somewhere else in the application domain that has nothing to do with your code could end up locking on the same value. This could end up either resulting in deadlocks, or the two segments waiting at times that they don't actually need to wait.

You should only ever lock on an object that you can be certain only the one type managing the synchronization can ever have access to. Doing this will mean that you can always look at just this one class to analyze the synchronization going on, and that you don't need to worry about what's going on in the rest of the application to determine the synchronization logic of this class.

Fortunately, you already do have an object that corresponds to each key, and that you never expose outside of this class, the List. You don't need to have a separate dictionary of objects to lock on, you can just use the List.

Another problem that you have is that you're fetching the value of the dictionary in your worker thread, which isn't safe, as you're modifying it in another thread, although this is trivially solved by simply fetching the value of the dictionary before starting the new thread, rather than after, and simply passing the string and List into RunSubsetSum rather than the Dictionary.

You're also mutating the List objects from both the worker threads and the main thread, so you'll need to ensure that the caller locks around the list before using it, as well as the workers.

Upvotes: 5

Mike U
Mike U

Reputation: 729

Servy is correct about locking on the string. The simply solution is to create a private field just for locking:

object LockMe = new object();

public void SomeMethod()
{
    lock(LockMe)
    {
        <... do something here ...>
    }
}

The other issue is to keep in mind that there is a maximum number of allowed threads per application so, if you are creating a thread for each key in the Dictionary, you run the risk of reaching that maximum thread count.

You might want to rethink your threading model.

Upvotes: -1

Cinchoo
Cinchoo

Reputation: 6322

lock on string is not going to help sync'ing resource. Since string is immutable object. Everytime, string is passed to the function will lead to new string. More importantly both methods are touching the dictionary at the same time. Would advise to combine them both and call from however your want.

private readonly object _padLock = new object();
public void CallingMethod() {
   Task.Factory.StartNew(() => { MyMethod(); })
}

public void MyMethod() {
   lock (_padLock)
   {
      //The myDict object needs to be shared by all threads.
      Dictionary<string, List<MyObject>> myDict = new Dictionary<string, List<MyObject>>();
      //GetKeyValue() may return the same key multiple times
      foreach(var kv in GetKeyValue()) { 
        if(myDict.ContainsKey(kv.Key) { myDict[kv.Key].Add(kv.Value); }
        else { myDict.Add(kv.Key, kv.Value); }
        RunSubsetSum(kv.Key, myDict);
      }
   }
}
//Resource intensive method
public void RunSubsetSum(string key, Dictionary<string, List<MyObject>> myDict)  { 
    //Lock on key so that no two threads run for the same key
        foreach(var valueToRemove in GetRemovableObjs()) 
            myDict[kv.Key].Remove(valueToRemove);
}

Upvotes: 0

Related Questions