code_fodder

Reputation: 16341

Is it more efficient to mutex-lock a variable multiple times in a code block, or just to lock the whole code block?

If we have the following code example:

if (letsDoThis) //bool
{
    sharedVar++; // This is shared across other threads.
          :
    sharedVar++;
          :
    sharedVar++;
          :
    sharedVar++;
          :
    sharedVar++;
}

Where each : stands for roughly 10 lines of code (no slow function calls or anything like that). Would it be faster to lock a mutex around the whole if block (well, the contents of the if block), or to lock and unlock around each individual sharedVar access?

If it is an "it depends" type of question, then which approach (as a rule of thumb, perhaps) is better to start with?

Finally, how can you determine which of the two runs faster on your system? Would a trace tool really show you useful data in a meaningful way?

Upvotes: 3

Views: 986

Answers (2)

MikeMB

Reputation: 21156

This depends (among all the other things that impact performance) on

  • How many threads and cores you have (the minimum of the two is what matters)
  • How much time threads spend in that part of the code (and other parts that lock the mutex)

From the perspective of a single thread, multiple locks and unlocks mean additional work and are, especially under contention, rather expensive and time-consuming, and they will probably lead to more cache ping-pong between the cores. So they will slow down an individual thread. However, if locking and unlocking multiple times reduces the total time a thread holds the mutex in each iteration, there can be more parallelism in your program, and the overall performance scales better with the number of threads and CPU cores.

Both effects might be negligible, both effects might be significant, and if the code isn't in the hot path of your program, it might not matter in the first place. I think the only thing you can do to determine what is better in your case is to just run both variants and measure overall throughput. If you don't see a difference, I'd probably go with a single lock and unlock (i.e. with a std::lock_guard) for simplicity.
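For illustration, here is a minimal sketch of the two variants being compared. doWork() stands in for the ~10 lines of code between increments, and the mutex m is assumed to guard sharedVar; the names are illustrative, not from the question.

#include <mutex>

std::mutex m;
int sharedVar = 0;

void doWork() { /* stands in for the ~10 lines between increments */ }

// Variant A: one lock around the whole block.
void coarse()
{
    std::lock_guard<std::mutex> lock(m); // held until the end of the scope
    sharedVar++;
    doWork();
    sharedVar++;
    doWork();
    sharedVar++;
}

// Variant B: lock and unlock around each individual access.
void fine()
{
    { std::lock_guard<std::mutex> lock(m); sharedVar++; }
    doWork();
    { std::lock_guard<std::mutex> lock(m); sharedVar++; }
    doWork();
    { std::lock_guard<std::mutex> lock(m); sharedVar++; }
}

Variant A keeps doWork() inside the critical section; variant B lets other threads run doWork() in parallel, at the cost of extra lock traffic.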

The more general question you should ask, however, is whether you really need this much synchronization between the threads, and whether you have to synchronize multiple times: if it is OK that other threads don't see intermediate values of the shared state (which you can't guarantee anyway unless they wait for each other), why not combine all the operations on the shared state into a single operation at the end of the block?
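A sketch of that idea, assuming the intermediate counter values really don't need to be visible to other threads (local is an illustrative name):

#include <mutex>

std::mutex m;
int sharedVar = 0;

void combinedUpdate()
{
    int local = 0;      // private to this thread, no locking needed
    local++;
    // ... ~10 lines of work ...
    local++;
    // ... ~10 lines of work ...
    local++;

    std::lock_guard<std::mutex> lock(m);
    sharedVar += local; // publish the combined result once
}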

And of course, if your shared state really is just a single integer, then you should use an atomic and get rid of the mutex altogether.
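That could be as simple as the following sketch; note that plain sharedVar++ on a std::atomic defaults to the strongest (sequentially consistent) ordering:

#include <atomic>

std::atomic<int> sharedVar{0};

void increment()
{
    sharedVar++; // an atomic read-modify-write, no mutex required
}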

Upvotes: 3

bobah

Reputation: 18864

If, from the program logic's point of view, the code between subsequent increments is allowed to run concurrently, then your best course is to use an atomic counter. If whatever is done before a counter increment needs to be immediately visible to other threads, then increment with release semantics (an atomic increment plus a barrier); otherwise, increment with relaxed semantics (just the atomic increment).
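In C++11 terms, the two flavours could look like this (counter is an illustrative name):

#include <atomic>

std::atomic<int> counter{0};

void incrementWithRelease()
{
    // Writes made before this increment become visible to any thread
    // that reads the new value with acquire semantics.
    counter.fetch_add(1, std::memory_order_release);
}

void incrementRelaxed()
{
    // Just the atomic increment; no ordering guarantees for
    // surrounding memory operations.
    counter.fetch_add(1, std::memory_order_relaxed);
}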

If for whatever reason an atomic increment is not an option, then benchmark your mutex ideas in a simple test application that runs the subject code concurrently in a loop. Google Benchmark is a nice little library that can save you some typing. If you only have raw POSIX threads, you can borrow my old code.
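A minimal harness along those lines, assuming Google Benchmark is installed (the function names and the thread count of 4 are illustrative):

#include <benchmark/benchmark.h>
#include <mutex>

std::mutex m;
int sharedVar = 0;

static void BM_CoarseLock(benchmark::State& state)
{
    for (auto _ : state) {
        std::lock_guard<std::mutex> lock(m); // one lock for all increments
        sharedVar++;
        sharedVar++;
    }
}
BENCHMARK(BM_CoarseLock)->Threads(4); // run concurrently on 4 threads

static void BM_FineLock(benchmark::State& state)
{
    for (auto _ : state) {
        { std::lock_guard<std::mutex> lock(m); sharedVar++; }
        { std::lock_guard<std::mutex> lock(m); sharedVar++; }
    }
}
BENCHMARK(BM_FineLock)->Threads(4);

BENCHMARK_MAIN();

Comparing the reported throughput of the two benchmarks gives you the measurement directly.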

Upvotes: 1
