Boost: Creating objects and populating a vector with threads

Question

Using this Boost.Asio-based thread pool (in this case the class is named ThreadPool), I want to parallelize the population of a vector of type std::vector<boost::shared_ptr<T>>, where T is a struct containing a vector of type std::vector<int> whose content and size are determined dynamically after the struct is initialized.

Unfortunately, I am a newb at both c++ and multi threading, so my attempts at solving this problem have failed spectacularly. Here's an overly simplified sample program that times the non-threaded and threaded versions of the tasks. The threaded version's performance is horrendous...

#include "thread_pool.hpp"
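#include <boost/bind.hpp>
#include <boost/shared_ptr.hpp>
#include <ctime>
#include <iostream>
#include <string>
#include <vector>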


using namespace boost;
using namespace std;


struct T {
  vector<int> nums = {};
};


typedef boost::shared_ptr<T> Tptr;
typedef vector<Tptr> TptrVector;


void create_T(const int i, TptrVector& v) {
  v[i] = Tptr(new T());
  T& t = *v[i].get();
  for (int i = 0; i < 100; i++) {
    t.nums.push_back(i);
  }
}


int main(int argc, char* argv[]) {
  clock_t begin, end;
  double elapsed;

  // define and parse program options

  if (argc != 3) {
    cout << argv[0] << " <iterations> <threads>" << endl;
    return 1;
  }
  int iterations = stoi(argv[1]),
      threads    = stoi(argv[2]);

  // create thread pool
  ThreadPool tp(threads);

  // non-threaded
  cout << "non-thread" << endl;
  begin = clock();

  TptrVector v(iterations);
  for (int i = 0; i < iterations; i++) {
    create_T(i, v);
  }

  end = clock();
  elapsed = double(end - begin) / CLOCKS_PER_SEC;
  cout << elapsed << " seconds" << endl;

  // threaded
  cout << "threaded" << endl;
  begin = clock();

  TptrVector v2(iterations);
  for (int i = 0; i < iterations; i++) {
    tp.submit(boost::bind(create_T, i, v2));
  }
  tp.stop();

  end = clock();
  elapsed = double(end - begin) / CLOCKS_PER_SEC;
  cout << elapsed << " seconds" << endl;

  return 0;
}

After doing some digging, I think the poor performance may be due to the threads vying for memory access, but my newb status is keeping me from exploiting this insight. Can you efficiently populate the pointer vector using multiple threads, ideally in a thread pool?

Andriy Tylychko · Accepted Answer

You haven't provided enough details or a Minimal, Complete, and Verifiable example, so expect lots of guessing.

create_T is a "cheap" function. Scheduling a task and the overhead of executing it are much more expensive than the work the task does, which is why your performance is bad. To get a boost from parallelism you need proper work granularity and a sufficient amount of work. Granularity means that each task (in your case, one call to create_T) should be big enough to pay for the multithreading overhead. The simplest approach is to group create_T calls into bigger tasks.
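
A minimal sketch of that grouping idea follows. It is not the asker's code: to stay self-contained it swaps the Boost.Asio ThreadPool for plain std::thread and boost::shared_ptr for std::shared_ptr, and the names create_range and populate_in_chunks are hypothetical. Each task fills a contiguous range of slots, so there is roughly one task per thread instead of one per element. The vector is passed with std::ref because thread and bind arguments are copied by default; the same applies to boost::bind, where boost::ref(v2) would be needed for the tasks to write into the caller's vector.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <thread>
#include <vector>

struct T {
  std::vector<int> nums;
};

using Tptr = std::shared_ptr<T>;
using TptrVector = std::vector<Tptr>;

// Hypothetical helper: fills slots [first, last) of v.
// Each slot is written by exactly one thread, so no locking is needed.
void create_range(std::size_t first, std::size_t last, TptrVector& v) {
  for (std::size_t i = first; i < last; ++i) {
    auto t = std::make_shared<T>();
    t->nums.reserve(100);  // avoid repeated reallocation while filling
    for (int n = 0; n < 100; ++n) {
      t->nums.push_back(n);
    }
    v[i] = std::move(t);
  }
}

// Hypothetical helper: splits the index range into at most `threads`
// contiguous chunks and runs one task per chunk.
TptrVector populate_in_chunks(std::size_t iterations, std::size_t threads) {
  assert(threads > 0);
  TptrVector v(iterations);  // sized up front; never resized by the workers
  const std::size_t chunk = (iterations + threads - 1) / threads;  // ceiling
  std::vector<std::thread> workers;
  for (std::size_t first = 0; first < iterations; first += chunk) {
    const std::size_t last = std::min(first + chunk, iterations);
    // std::ref passes v by reference; without it the thread gets a copy.
    workers.emplace_back(create_range, first, last, std::ref(v));
  }
  for (auto& w : workers) {
    w.join();
  }
  return v;
}
```

With the ThreadPool from the question, the equivalent would presumably be to submit one boost::bind(create_range, first, last, boost::ref(v2)) per chunk; this assumes submit accepts any nullary callable, as the question's usage suggests. When timing the result, keep in mind that clock() reports processor time for the whole process rather than wall-clock time, so it tends to make a multithreaded run look slower than it is.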
