rahuljain1311

Reputation: 2160

Is there a better way to use C++ concurrency than below?

I am trying to learn the basics of threading, mutex, etc., following the documentation and examples from here. In the code below I am getting the output as expected. Questions:

  1. I want to confirm: is there any pitfall that I am missing? How can we improve the code below?
  2. On which line did my thread try to take the mutex, or start waiting for it? (Is it after line 11 or on line 11?)
  3. Is it ok to include cv.notify_all(); in the thread code?
#include<mutex>
#include<iostream>
#include<thread>
#include<condition_variable>

std::mutex mtx;
std::condition_variable cv;
int currentlyRequired = 1;

void work(int id){
    std::unique_lock<std::mutex> lock(mtx); // Line 11
    while(currentlyRequired != id){
        cv.wait(lock);
    }
    std::cout<<"Thread # "<<id<<"\n";
    currentlyRequired++;
    cv.notify_all();
}

int main(){
    std::thread threads[10];
    
    for(int i=0; i<10; i++){
        threads[i] = std::thread(work, i+1);
    }
    
    for(int i=0; i<10; i++){
        threads[i].join();
    }
    return 0;
}

/* Current Output:
Thread # 1
Thread # 2
Thread # 3
Thread # 4
Thread # 5
Thread # 6
Thread # 7
Thread # 8
Thread # 9
Thread # 10
Program ended with exit code: 0
*/

Upvotes: 1

Views: 481

Answers (1)

Persixty

Reputation: 8589

First and foremost I recommend you read the documentation here: cppreference.com

It's a little more discursive about the key points you need to conform to in order to use a condition variable for inter-thread waiting and notification.

I think your code measures up. It obtains the lock on line 11, as you think. Other constructors of unique_lock can adopt a mutex already locked by the current thread (std::adopt_lock), or take a mutex without locking it (std::defer_lock) and lock it later when requested (that would be lock.lock(); here).
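
For illustration only (these two helper functions are mine, not part of the question's code), a minimal sketch of those other constructors:

#include<mutex>

std::mutex mtx;

void deferred_example(){
    std::unique_lock<std::mutex> lock(mtx, std::defer_lock); // not locked yet
    // ... work that doesn't need the mutex ...
    lock.lock();                                             // take it when needed
    // ... touch shared state; released when lock is destroyed ...
}

void adopting_example(){
    mtx.lock();                                              // already locked by this thread
    std::unique_lock<std::mutex> lock(mtx, std::adopt_lock); // take ownership of the existing lock
    // ... released when lock is destroyed ...
}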

What you have is right. You check the relevant data while holding the lock. wait() then unlocks it (through the unique_lock) and awaits notification. When notified it stops waiting, locks the mutex again, and loops to check the condition. Eventually the condition is true and (still holding the lock) each thread continues on to do its 'work'. The 'waiting' side looks correct.

The 'notifying' side also looks correct. The data the condition depends on must be modified while holding the mutex, to ensure the correct synchronisation of checking the data and entering the waiting state that the condition-variable manages.
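
As an aside, the predicate overload of wait() expresses the same check-in-a-loop pattern more compactly. A minimal sketch of the question's work() rewritten that way (same globals, same behaviour):

void work(int id){
    std::unique_lock<std::mutex> lock(mtx);
    // Equivalent to the while loop: wait() re-checks the predicate each time
    // it wakes, which also guards against spurious wakeups.
    cv.wait(lock, [id]{ return currentlyRequired == id; });
    std::cout<<"Thread # "<<id<<"\n";
    currentlyRequired++;
    cv.notify_all();
}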

You correctly notify_all(). Even though, logically (in this example), only one thread needs to wake up, there's no way of picking it out to be the target of notify_one().

So all the threads wake up (if suspended), check their condition and exactly one of them identifies its turn and runs.

Common wisdom (and my experience) is that it's better (and valid) to call notify_all() without holding the lock, because otherwise the waiting threads may wake up only to block again (on the lock). But I'm told some platforms run better when notifying under the lock. Welcome to the world of platform dependence...
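
For illustration, here is a sketch of the question's work() restructured to notify after the lock has been released; whether this actually helps is, as noted, platform dependent:

void work(int id){
    {
        std::unique_lock<std::mutex> lock(mtx);
        cv.wait(lock, [id]{ return currentlyRequired == id; });
        std::cout<<"Thread # "<<id<<"\n";
        currentlyRequired++;
    }                  // the unique_lock releases the mutex here
    cv.notify_all();   // notify without holding the mutex
}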

So in terms of implementing a condition-variable I think that's valid and pretty much textbook.

It's good to see the join() as well. I have a bugbear about coders not joining their threads. It's one of those things you get away with at small scale and load, but it can start to cause problems and confusion once the application scales and experiences high load.

The problem I have with what you've done is that it has no parallelism.

What you've achieved is a daisy-chain. The very intention is to ensure only one thread is 'doing work' at once and they do so in strict order.

To take advantage of concurrency we want to maximise parallelism: the number of threads running in parallel doing 'the work' (i.e. not the housekeeping of inter-thread communication). Your code is literally (because you did it right!) guaranteed to ensure there is never more than one thread running at a time, and because of that housekeeping it is guaranteed to be slightly slower than a single-threaded application (which would be a for loop!).

So it gets top-marks for program correctness but no marks for being useful!

Both the examples on cplusplus.com and cppreference.com are little better, in my view.

The best introductory example is some kind of producer consumer model. That's closer to an actually useful pattern.

Try something as simple as this: one producer thread counts up through the integers, and multiple consumer threads square them and output them. A key point is that, if you're doing it right, the squares won't in general come out in sequential order.
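
To be clear, this is my sketch of that shape, not a prescribed solution; the names (taskQueue, done, and so on) are made up for the example. One producer pushes integers, four consumers pop and square them, and a done flag plus a final notify_all() lets the consumers exit:

#include<condition_variable>
#include<iostream>
#include<mutex>
#include<queue>
#include<string>
#include<thread>
#include<vector>

std::mutex queueMtx;
std::condition_variable queueCv;
std::queue<int> taskQueue;
bool done = false;

void producer(int count){
    for(int i = 1; i <= count; i++){
        {
            std::lock_guard<std::mutex> guard(queueMtx);
            taskQueue.push(i);
        }
        queueCv.notify_one();            // one new item, one consumer needs to wake
    }
    {
        std::lock_guard<std::mutex> guard(queueMtx);
        done = true;
    }
    queueCv.notify_all();                // wake everyone so they can exit
}

void consumer(){
    for(;;){
        int value;
        {
            std::unique_lock<std::mutex> lock(queueMtx);
            queueCv.wait(lock, []{ return !taskQueue.empty() || done; });
            if(taskQueue.empty()) return;    // finished and nothing left to do
            value = taskQueue.front();
            taskQueue.pop();
        }
        // The 'work' (squaring) happens outside the lock, so consumers run in parallel.
        // Build the whole line first; a single insertion is far less likely to interleave.
        std::cout << (std::to_string(value) + " squared is "
                      + std::to_string(value * value) + "\n");
    }
}

int main(){
    std::thread prod(producer, 20);
    std::vector<std::thread> consumers;
    for(int i = 0; i < 4; i++){
        consumers.emplace_back(consumer);
    }
    prod.join();
    for(auto& t : consumers){
        t.join();
    }
    return 0;
}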

The examples on those sites (and yours) are like teaching recursion with factorial. Sure, it's recursive, but recursion is a terrible way to calculate factorial! Sure, the multi-threading is valid, but it's outright contrived to do nothing useful in parallel!

Please don't take that as a criticism. As a 'tooth cutting' exercise what you've got is top-class. The next task is to do something that uses concurrency usefully!

The problem is we don't want condition-variables! By definition they make threads wait for each other and that reduces parallelism. We need them and they do their job. But a design that reduces the amount of mutual waiting (either on locks or conditions) is usually better because the enemy here is waiting, blocking or (worst) spinning instead of suspended waiting/blocking.

Here's a better design for your task. Better because it entirely avoids the condition-variable!!

#include<mutex>
#include<iostream>
#include<thread>

std::mutex mtx;
int currentlyRequired = 1;

void work(int id){
    std::lock_guard<decltype(mtx)> guard(mtx);
    const auto old{currentlyRequired++};
    std::cout<<"Thread # "<<id<<" "<< old << "-> " <<currentlyRequired<< "\n";
}

int main(){
    std::thread threads[10];
    
    for(int i=0; i<10; i++){
        threads[i] = std::thread(work, i+1);
    }
    
    for(int i=0; i<10; i++){
        threads[i].join();
    }
    std::cout << "Final result: " << currentlyRequired << std::endl;
    return 0;
}

Specimen output:

Thread # 7 1-> 2
Thread # 8 2-> 3
Thread # 9 3-> 4
Thread # 10 4-> 5
Thread # 6 5-> 6
Thread # 5 6-> 7
Thread # 4 7-> 8
Thread # 3 8-> 9
Thread # 2 9-> 10
Thread # 1 10-> 11
Final result: 11

Which thread does which increment will vary, but the final result is always 11. It's still no good, though: there's still no parallelism, because the work is all done under a single lock. That's why we need a more interesting task.
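
To illustrate what getting the work out from under the lock looks like, here is one more sketch of mine (the 'work' is a made-up partial sum per thread; only the final accumulation into shared state is serialised):

#include<iostream>
#include<mutex>
#include<thread>

std::mutex mtx;
long long total = 0;

void work(int id){
    // The 'real work' runs outside any lock, so all threads can do it in parallel.
    long long partial = 0;
    for(int i = 0; i < 1000000; i++){
        partial += static_cast<long long>(id) * i;
    }
    // Only the update of shared state needs the mutex.
    std::lock_guard<std::mutex> guard(mtx);
    total += partial;
}

int main(){
    std::thread threads[10];
    for(int i = 0; i < 10; i++){
        threads[i] = std::thread(work, i + 1);
    }
    for(int i = 0; i < 10; i++){
        threads[i].join();
    }
    std::cout << "Total: " << total << "\n";
    return 0;
}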

Have fun.

Upvotes: 1
