Brett

Reputation: 12007

C++ - Multithreading

I have the code:

for (int i = 0; i < (int)(kpts1.size()); i++) {
    perform_operation(kpts1[i], *kpts2[i]);
}

where kpts1 and kpts2 are std::vector<> objects. The function perform_operation takes kpts1[i], performs an operation on it, and stores the result in kpts2[i].

It seems like I should be able to multithread this. Since each iteration of the for loop is independent of the others, I should be able to run them in parallel with as many threads as there are CPU cores, right?

I've seen several SO questions that kind of answer this, but they don't really get at how to parallelize a simple for loop, and I'm not sure whether reading from the same kpts1 variable and writing to the same kpts2 variable from multiple threads is possible.

Or am I misunderstanding something? Is this not parallelizable?

I'd be happy if I could find a solution in C++ or C, but right now I am stuck.

Upvotes: 2

Views: 932

Answers (2)

Jay

Reputation: 14481

I believe you're asking whether you can operate on each element of the array in a separate thread.

You can. There are several considerations though.

As long as the separate operations don't affect each other, it's a good candidate for parallelism.

As a practical matter, standard CPU threads are slow to set up and reserve a fair amount of memory: each pthread gets its own stack, and the default size is platform-dependent (commonly 8 MB on Linux, and adjustable with pthread_attr_setstacksize). If the tasks are fairly intensive, the time saved outweighs the setup overhead. If not, the threaded version is harder to code, bigger, and slower than the straightforward loop.

Intel TBB is one option. NVIDIA CUDA is another.
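
If pulling in a library is not an option, the setup cost can also be kept small with plain C++11 std::thread by creating one thread per core and giving each a contiguous chunk of the loop. This is only a rough sketch under assumptions: the int element type, the body of perform_operation, and the name parallel_apply are made up for illustration and are not from the question.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for the question's perform_operation:
// reads one input element and writes one output element.
void perform_operation(const int& in, int& out) {
    out = in * in;
}

// Splits the index range [0, in.size()) into contiguous chunks and
// processes each chunk on its own thread (at most one thread per core).
void parallel_apply(const std::vector<int>& in, std::vector<int>& out) {
    assert(in.size() == out.size());
    const std::size_t n = in.size();
    if (n == 0) return;

    // hardware_concurrency() may return 0 when the core count is unknown.
    std::size_t workers = std::thread::hardware_concurrency();
    if (workers == 0) workers = 2;
    workers = std::min(workers, n);

    const std::size_t chunk = (n + workers - 1) / workers;  // ceiling division
    std::vector<std::thread> threads;
    threads.reserve(workers);

    for (std::size_t begin = 0; begin < n; begin += chunk) {
        const std::size_t end = std::min(begin + chunk, n);
        // Each thread reads in[begin..end) and writes only out[begin..end),
        // so no two threads ever write to the same element.
        threads.emplace_back([&in, &out, begin, end] {
            for (std::size_t i = begin; i < end; ++i) {
                perform_operation(in[i], out[i]);
            }
        });
    }

    // Wait for every worker before the caller touches `out`.
    for (std::thread& t : threads) t.join();
}
```

Calling parallel_apply(kpts1, kpts2) would replace the original loop, provided kpts2 already has the same size as kpts1. Concurrent reads of the input vector are fine, and writes are safe here because each thread owns a disjoint range of the output and nothing resizes either vector while the threads run.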

Upvotes: 1

abligh

Reputation: 25179

Provided each call to perform_operation is independent of the others, then yes, this is parallelizable.

Rather than calling perform_operation directly, start a new thread for each call with pthread_create. You will need to wrap the parameters in a single struct (it could just hold pointers to both arguments) and pass a wrapper around perform_operation as start_routine. That creates one thread per element. Then, in a second for loop, use pthread_join to wait for the threads you created to exit.

That's a rough outline. Obviously some error handling would be useful, and you might want each thread to perform a number of perform_operation calls serially, rather than using one thread per item. But you should get the basic idea from the above.
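
To make the outline concrete, here is a minimal sketch of the pthread_create / pthread_join pattern, using the variant where each thread handles a range of items serially. The Task struct, run_task, run_with_pthreads, the int element type, and the body of perform_operation are all hypothetical names and choices for illustration, not anything from the question.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <pthread.h>
#include <vector>

// Hypothetical stand-in for the question's perform_operation.
void perform_operation(const int& in, int& out) {
    out = in + 1;
}

// All parameters for one thread, wrapped in a single struct because
// pthread_create passes exactly one void* to the start routine.
struct Task {
    const std::vector<int>* in;
    std::vector<int>* out;
    std::size_t begin;  // first index handled by this thread
    std::size_t end;    // one past the last index handled by this thread
};

// start_routine wrapper around perform_operation.
extern "C" void* run_task(void* arg) {
    Task* task = static_cast<Task*>(arg);
    for (std::size_t i = task->begin; i < task->end; ++i) {
        perform_operation((*task->in)[i], (*task->out)[i]);
    }
    return nullptr;
}

// Returns true if every thread was created and joined successfully.
bool run_with_pthreads(const std::vector<int>& in, std::vector<int>& out,
                       std::size_t num_threads) {
    assert(in.size() == out.size());
    assert(num_threads > 0);
    const std::size_t n = in.size();
    const std::size_t chunk = (n + num_threads - 1) / num_threads;

    // Sized up front so the Task addresses handed to threads stay valid.
    std::vector<Task> tasks(num_threads);
    std::vector<pthread_t> threads;
    threads.reserve(num_threads);
    bool ok = true;

    // First loop: create the threads.
    for (std::size_t t = 0; t < num_threads; ++t) {
        const std::size_t begin = std::min(t * chunk, n);
        const std::size_t end = std::min(begin + chunk, n);
        tasks[t] = Task{&in, &out, begin, end};
        pthread_t id;
        if (pthread_create(&id, nullptr, run_task, &tasks[t]) != 0) {
            ok = false;  // stop creating, but still join what was started
            break;
        }
        threads.push_back(id);
    }

    // Second loop: wait for every thread that was actually created.
    for (pthread_t id : threads) {
        if (pthread_join(id, nullptr) != 0) ok = false;
    }
    return ok;
}
```

On most toolchains this needs to be built with the -pthread flag. Passing the element count as num_threads gives the one-thread-per-item version described above, while passing the number of CPU cores gives the chunked version.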

Upvotes: 1
