Determining Optimum Thread Count

Question

So, as part of a school assignment, we are being asked to determine what our optimum thread count is for our personal computers by constructing a toy program.

To start, we are to create a task that takes between 20 and 30 seconds to run. I chose to do a coin toss simulation, where the total number of heads and tails are accumulated and then displayed. On my machine, 300,000,000 tosses on one thread ended up at 25 seconds. After that, I went to 2 threads, then 4, then 8, 16, 32, and, just for fun, 100. Here are the results:

* Thread Tosses per thread time(seconds) * ------------------------------------------ * 1 300,000,000 25 * 2 150,000,000 13 * 4 75,000,000 13 * 8 37,500,000 13 * 16 18,750,000 14 * 32 9,375,000 14 * 100 3,000,000 14

And here is the code I'm using:

void toss()
{
    int heads = 0, tails = 0;
    default_random_engine gen;
    uniform_int_distribution dist(0,1);
    int max =3000000;                          //tosses per thread
    for(int x = 0; x < max; ++x){(dist(gen))?++heads:++tails;}
    cout<thr;
    time_t st, fin;
    st = time(0);

    for(int i = 0;i < 100;++i){thr.push_back(thread(toss));} //thread count
    for(auto& thread: thr){thread.join();}

    fin = time(0);
    cout<



Now for the main question:

Past a certain point, I would've expected there to be a considerable decline in computing speed as more threads were added, but the results don't seem to show that. 

Is there something fundamentally wrong with my code that would yield these sorts of results, or is this behavior considered normal? I'm very new to multi-threading, so I have a feeling it's the former....

Thanks!

EDIT: I am running this on a macbook with a 2.16 GHz Core 2 Duo (T7400) processor

kmdreko · Accepted Answer

Your results seem very normal to me. While thread creation has a cost, its not that much (especially compared to the per-second granularity of your tests). An extra 100 thread creations, destructions, and possible context-switches isn't going to change your timing by more than a few milliseconds I bet.

Running on my Intel i7-4790 @ 3.60 GHz I get these numbers:

threads - seconds
-----------------
1       -  6.021
2       -  3.205
4       -  1.825
8       -  1.062
16      -  1.128
32      -  1.138
100     -  1.213
1000    -  2.312
10000   - 23.319

It takes many, many more threads to get to the point at which the extra threads make a noticeable difference. Only when I get to 1,000 threads do I see that the thread-management has made a significant difference and at 10,000 it dwarfs the loop (the loop is only doing 30,000 tosses at that point).

As towards your assignment, it should be fairly straightforward to see that the optimal number of threads for your system should be the same as the available threads that can be executed at once. There's not any processing power left to execute another thread until one is either done or yielded, which doesn't help you finish faster. And, any less threads and you aren't using all available resources. My CPU has 8 threads and the chart reflects that.

Determining Optimum Thread Count

Answers (2)

Related Questions