user67081

Reputation: 339

Boost Threading Conceptualization / Questions

I've got a function that is typically run 50 times (to run 50 simulations). Usually this is done sequentially on a single thread, but I'd like to speed things up using multiple threads. The threads don't need to access each other's memory or data, so I don't think race conditions are an issue. Essentially, each thread should just complete its task and report back to main that it has finished, also returning a double value.

First of all, looking through all the Boost documentation and examples has really confused me, and I'm not sure what I'm looking for anymore. boost::thread? boost::future? Could someone give an example of what is applicable in my case? Additionally, I don't understand how to specify how many threads to run. Would I just start 50 threads and let the OS decide when to execute them?

Upvotes: 2

Views: 974

Answers (2)

Jonathan Wakely

Reputation: 171253

Read the Boost.Thread Futures docs for an idea of how to use futures and async to achieve this. They also show how to do it manually (the hard way) using thread objects.

Given this serial code:

double run_sim(Data*);

int main()
{
  const unsigned ntasks = 50;

  double results[ntasks];
  Data data[ntasks];

  for (unsigned i=0; i<ntasks; ++i)
    results[i] = run_sim(&data[i]);
}

A naive parallel version would be:

#define BOOST_THREAD_PROVIDES_FUTURE
#include <boost/thread/future.hpp>
#include <boost/bind.hpp>

double run_sim(Data*);

int main()
{
  const unsigned nsim = 50;

  Data data[nsim];
  boost::future<double> futures[nsim];

  for (unsigned i=0; i<nsim; ++i)
    futures[i] = boost::async(boost::bind(&run_sim, &data[i]));

  double results[nsim];
  for (unsigned i=0; i<nsim; ++i)
    results[i] = futures[i].get();
}

Because boost::async doesn't yet support deferred functions, every async call will create a new thread, so this will spawn 50 threads at once. This might perform quite badly, so you could split the work up into smaller blocks:

#define BOOST_THREAD_PROVIDES_FUTURE
#include <boost/thread/future.hpp>
#include <boost/thread/thread.hpp>
#include <boost/bind.hpp>

double run_sim(Data*);

int main()
{
  const unsigned nsim = 50;
  unsigned nprocs = boost::thread::hardware_concurrency();
  if (nprocs == 0)
    nprocs = 2;   // cannot determine number of cores, let's say 2

  Data data[nsim];   
  boost::future<double> futures[nsim];
  double results[nsim];

  for (unsigned i=0; i<nsim; ++i)
  {
    if ( ((i+1) % nprocs) != 0 )
      futures[i] = boost::async(boost::bind(&run_sim, &data[i]));
    else
      results[i] = run_sim(&data[i]);
  }

  for (unsigned i=0; i<nsim; ++i)
    if ( ((i+1) % nprocs) != 0 )
      results[i] = futures[i].get();
}

If hardware_concurrency() returns 4, this will create three new threads and then call run_sim synchronously in the main() thread, then create another three new threads and call run_sim synchronously again, and so on. This prevents all 50 threads from being created at once, because the main thread periodically stops to do some of the work itself, which gives some of the other threads time to complete.

The code above requires quite a recent version of Boost. It's slightly easier using Standard C++, if you can use C++11:

#include <future>
#include <thread>

double run_sim(Data*);

int main()
{
  const unsigned nsim = 50;
  Data data[nsim];

  std::future<double> futures[nsim];
  double results[nsim];

  unsigned nprocs = std::thread::hardware_concurrency();
  if (nprocs == 0)
    nprocs = 2;

  for (unsigned i=0; i<nsim; ++i)
  {
    if ( ((i+1) % nprocs) != 0 )
      futures[i] = std::async(std::launch::async, &run_sim, &data[i]);
    else
      results[i] = run_sim(&data[i]);
  }

  for (unsigned i=0; i<nsim; ++i)
    if ( ((i+1) % nprocs) != 0 )
      results[i] = futures[i].get();
}
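
If a hard upper bound on the number of threads is wanted, one possible variation (a sketch only, not taken from the code above) is to start a fixed number of asynchronous workers and give each one a strided share of the simulations. Here Data and run_sim are hypothetical stand-ins so that the example compiles, and run_all is an assumed helper name:

```cpp
#include <cassert>
#include <cstddef>
#include <future>
#include <thread>
#include <vector>

// Hypothetical stand-ins for the question's types: Data and run_sim are
// assumptions made only so that this sketch compiles.
struct Data { double seed; };
double run_sim(Data* d) { return d->seed * 2.0; }

// Runs every simulation using at most nworkers extra threads. Worker w
// handles indices w, w + nworkers, w + 2*nworkers, ... so no two workers
// ever write to the same element of results.
std::vector<double> run_all(std::vector<Data>& data, unsigned nworkers)
{
  assert(nworkers > 0);
  std::vector<double> results(data.size());
  std::vector<std::future<void>> workers;

  for (unsigned w = 0; w < nworkers; ++w)
    workers.push_back(std::async(std::launch::async,
      [&data, &results, w, nworkers] {
        for (std::size_t i = w; i < data.size(); i += nworkers)
          results[i] = run_sim(&data[i]);
      }));

  for (auto& f : workers)
    f.get();   // waits for the worker and rethrows any exception it raised

  return results;
}
```

Calling run_all(data, nprocs) with nprocs taken from std::thread::hardware_concurrency() would then never run more than nprocs simulations at a time. The strided split assumes the simulations take roughly equal time; if they vary a lot, some workers will finish early and sit idle.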

Upvotes: 1

Ulrich Eckhardt

Reputation: 17413

If your code is completely CPU-bound (no network/disk IO), then you would benefit from starting as many background threads as you have CPUs. Use Boost's hardware_concurrency() function to determine that number and/or allow the user to set it. Just starting a bunch of threads is not helpful, as that will increase the overhead caused by creating, switching and terminating threads.

The code starting the threads is a simple loop, followed by another loop to wait for the threads' completion. You can also use the thread_group class for that. If the number of jobs is not known in advance and the jobs can't be distributed on thread startup, consider using a thread pool, where you start a sensible number of threads and then hand them jobs as they come up.
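
As a rough illustration of the "start loop, then join loop" idea, here is a minimal sketch that uses std::thread from C++11 instead of boost::thread or thread_group (a deliberate substitution so the example needs no external library). Data, run_sim, run_jobs and default_thread_count are hypothetical names introduced for the example. Each worker claims the next unprocessed job from a shared atomic counter, which approximates a very simple thread pool for a fixed list of jobs:

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-ins for the question's simulation code.
struct Data { double seed; };
double run_sim(Data* d) { return d->seed + 1.0; }

// Number of worker threads to start; falls back to 2 when the number of
// cores cannot be determined (hardware_concurrency() may return 0).
unsigned default_thread_count()
{
  const unsigned n = std::thread::hardware_concurrency();
  return n != 0 ? n : 2;
}

// Starts nthreads workers. Each worker repeatedly claims the next job
// index from an atomic counter until all jobs have been taken.
std::vector<double> run_jobs(std::vector<Data>& jobs, unsigned nthreads)
{
  assert(nthreads > 0);
  std::vector<double> results(jobs.size());
  std::atomic<std::size_t> next(0);   // index of the next unclaimed job

  auto worker = [&] {
    for (;;) {
      const std::size_t i = next.fetch_add(1);
      if (i >= jobs.size())
        break;                        // nothing left to do
      results[i] = run_sim(&jobs[i]); // each index is written by one thread only
    }
  };

  std::vector<std::thread> threads;
  for (unsigned t = 0; t < nthreads; ++t)   // first loop: start the threads
    threads.emplace_back(worker);
  for (auto& t : threads)                   // second loop: wait for completion
    t.join();

  return results;
}
```

A call such as run_jobs(jobs, default_thread_count()) keeps every core busy even when individual simulations take different amounts of time. Note that this sketch assumes run_sim does not throw: an exception escaping a std::thread terminates the program.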

Upvotes: 4
