Which is the proper way to use locking inside of Parallel.ForEach?

I would like to learn the best way to use locking in a Parallel.ForEach. Should I lock the whole code block inside of the iteration or should I only lock the object which I want to use as multi-thread safe before I do any process?

for example:

Parallel.ForEach(list, item =>
        {
            lock (secondList)
            {
                //consider other processes works in here

                if (item.Active)
                    secondList.Add(item);
            }
        });

Parallel.ForEach(list, item =>
        {
            //consider other processes works in here

            if (item.Active)
            {
                lock (secondList)
                    secondList.Add(item);
            }
        });

Upvotes: 1

Answers (4)

Theodor Zoulias

Reputation: 43996

The best way is the second one:

if (item.Active)
{
    lock (secondList)
        secondList.Add(item);
}

The general advice is: do the least amount of work while holding the lock, and release it at the earliest opportunity. Don't do work inside the lock that can be done outside the lock.

Checking the item.Active does not require synchronization, so putting it inside lock is not needed. The more work that you put inside the lock, the more likely that another thread will be blocked while trying to acquire the same locker object. This is called contention, and slows down you application. You can get contention statistics by querying the Monitor.LockContentionCount property before and after the parallel execution:

Gets the number of times there was contention when trying to take the monitor's lock.

The smallest the delta (after - before), the better.

Upvotes: 0

Cem Önal

Reputation: 21

For example that kind of usings:

public static T CastTo<T>(this ArrayOfKeyValueOfstringstringKeyValueOfstringstring[] item)
    {
        var obj = Activator.CreateInstance(typeof(T), true);
        var padlock = new object();

        Parallel.ForEach(typeof(T).GetProperties(BindingFlags.Instance | BindingFlags.Public), prop =>
        {
            lock (padlock)
            {
                if (!prop.TryGetAttribute<GldnhrnFieldAttribute>(out var fieldAttribute))
                    return;

                var code = fieldAttribute?.Code;

                if (string.IsNullOrEmpty(code)) return;

                SetPropertyValue(item, obj, prop);
            }
        });

        return (T)obj;
    }

as you can see I want to cast my data to class over here. same question with different code block, should I lock all code block or should I lock only before calling SetPropertyValue method?

 public static T CastTo<T>(this ArrayOfKeyValueOfstringstringKeyValueOfstringstring[] item)
    {
        var obj = Activator.CreateInstance(typeof(T), true);
        var padlock = new object();

        Parallel.ForEach(typeof(T).GetProperties(BindingFlags.Instance | BindingFlags.Public), prop =>
        {
            if (!prop.TryGetAttribute<GldnhrnFieldAttribute>(out var fieldAttribute))
                return;

            var code = fieldAttribute?.Code;

            if (string.IsNullOrEmpty(code)) return;

            lock (padlock)
                SetPropertyValue(item, obj, prop);
        });

        return (T)obj;
    }

Upvotes: 1

Damien_The_Unbeliever

Reputation: 239824

Parallel.ForEach is a way to try to get more parallelism into your code. lock tends to reduce parallelism in your code¹. As such, it's rarely correct to want to combine them².

As Olegl suggested, Concurrent Collections could be one way to go here to avoid the lock.

Another interesting approach would be to use PLINQ here instead of Parallel.ForEach. It's 2019 already, what's interesting about writing another loop?

This would do something like this instead:

secondList.AddRange(list.AsParallel.Where(item =>
{
    //consider other processes works in here

    return item.Active;
});

This allows you to keep your non-thread-safe secondList collection but still not worry about locks - because it's your own existing thread calling AddRange that ends up consuming that IEnumerable<T> that PLINQ offers; so only that one thread is adding items to the collection.

PLINQ tries to tune buffering options but may not achieve a good enough job, depending on the size of the input list and how many threads it chooses to use. If you're unhappy with any speedup from it (or it doesn't achieve any), try playing with the WithXxx methods it offers before writing it off.

If I had to pick between your two examples (assuming that they're both otherwise correct), I'd choose option 2, because it does less work whilst holding a lock that is being hotly contested by all of the other workers.

¹Unless you know that all of the locks that will be requested are fine-grained enough that no two parallel threads will attempt to acquire the same lock. But if we know that, why are we using locks again?

²And so I'll go out on a limb and say it's "always" incorrect to combine them when all parallel locks are on the same lock object, unless there's significant processing happening in parallel outside of the lock.

Upvotes: 2

OlegI

Reputation: 6020

If your application concurrent(parallelism is one of the types of concurrency) and you want to use a thread-safe collection, there is no reason to lock collections on your own. There are bunch of concurrent collections provided by Microsoft which exist in System.Collections.Concurrent Thread-safe Collections

Upvotes: 3

Which is the proper way to use locking inside of Parallel.ForEach?

Answers (4)

Related Questions