GDub

Reputation: 638

Why doesn't multi-threading offer speedup?

I've noticed that, in this simple example, multi-threading almost always takes longer than a single thread. I'm just testing it out with the code below, which I wrote myself, on a 24-core processor. It seems to work best with 2 threads, and 3 or more threads are worse than using 1.

#include <thread>
#include <mutex>
#include <condition_variable>
#include <iostream>
using namespace std;
mutex total;
mutex coutLock;

mutex order;
long long sum=1000000000;
long long mysum=0;

const int threads=3;
long long x;

void dowork(int x,int threads) {
    long long temp=0;
    for(long long i=x*sum/threads;i<((x+1)*sum/threads);i++) {
        temp+=i;
    }

    total.lock();
    mysum+=temp;
    total.unlock(); 
}

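int main() {
    thread * pool[threads];
    // Start one worker per chunk of the range.
    for(x=0;x<threads;x++) {
        thread *mine=new thread(dowork,x,threads);
        pool[x]=mine;
    }

    // Wait for every worker, then free the thread object.
    for(x=0;x<threads;x++) {
        pool[x]->join();
        delete pool[x];
    }

    cout<<"My sum is: "<<mysum<<endl;
}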

Upvotes: 2

Views: 159

Answers (2)

yoh2

Reputation: 161

The loop in dowork() can be reduced to O(1) code that computes the following equation:

temp = (b - a + 1) * a + (b - a) * (b - a + 1) / 2
       where a = x * sum / threads, b = (x + 1) * sum / threads - 1

For instance, clang++ 3.5.1 actually generates such code. In that case, unfortunately, each thread does a constant amount of work, so the total amount of calculation is proportional to the number of threads instead of being divided among them.
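
As a rough illustration of that reduction, here is a minimal sketch that compares the loop with the closed form. The function names range_sum_loop and range_sum_closed are made up for this example; a and b are the inclusive bounds defined above. It is not the code clang++ emits, only the same arithmetic written by hand.

```cpp
#include <cassert>

// Sum of all integers in [a, b], computed the way the loop in dowork() does.
long long range_sum_loop(long long a, long long b) {
    long long temp = 0;
    for (long long i = a; i <= b; ++i) {
        temp += i;
    }
    return temp;
}

// Same sum in O(1), using the equation from the answer.
// Hypothetical helper name; assumes a <= b and that the result fits in long long.
long long range_sum_closed(long long a, long long b) {
    assert(a <= b);
    long long n = b - a + 1;            // number of terms in the range
    // (b - a) and n are consecutive integers, so their product is even
    // and the division by 2 is exact.
    return n * a + (b - a) * n / 2;
}
```

With sum = 1000000000 and threads = 3, calling range_sum_closed once per chunk and adding the three results gives the same total as the original program, but each call costs a handful of arithmetic operations no matter how large the chunk is.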

Upvotes: 2

Martin Perry

Reputation: 9527

Your code is so simple that the compiler probably applies some optimization in the single-threaded run (such as auto-vectorization).

Creating a new thread is also a fairly expensive operation, and a single thread can finish even before all of your threads have been created. Common practice is to create a thread pool once and then reuse threads from that pool. They don't need to be allocated again, so using them is faster at runtime. But that is not meant for an app as simple as this one.
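
To get a feeling for how much of the runtime is thread start-up rather than useful work, something like the following sketch may help. The names parallel_sum and time_parallel_sum_us are invented for this example, and the numbers will depend heavily on the machine, compiler, and optimization flags, so treat it as a measuring aid rather than a benchmark.

```cpp
#include <cassert>
#include <chrono>
#include <thread>
#include <vector>

// Sums 0 .. n-1 using nthreads threads. Each thread writes its own slot,
// so no mutex is needed. Hypothetical helper, not the asker's exact code.
long long parallel_sum(long long n, int nthreads) {
    assert(nthreads > 0);
    std::vector<long long> partial(nthreads, 0);
    std::vector<std::thread> pool;
    pool.reserve(nthreads);
    for (int t = 0; t < nthreads; ++t) {
        pool.emplace_back([&partial, n, nthreads, t] {
            long long temp = 0;
            for (long long i = t * n / nthreads; i < (t + 1) * n / nthreads; ++i) {
                temp += i;
            }
            partial[t] = temp;
        });
    }
    long long total = 0;
    for (int t = 0; t < nthreads; ++t) {
        pool[t].join();          // wait for the worker before reading its slot
        total += partial[t];
    }
    return total;
}

// Wall-clock time of one parallel_sum call, in microseconds.
long long time_parallel_sum_us(long long n, int nthreads) {
    auto start = std::chrono::steady_clock::now();
    volatile long long result = parallel_sum(n, nthreads);
    (void)result;                // keep the call from being discarded
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}
```

Calling time_parallel_sum_us(0, 24) does no summing at all, so it measures almost nothing but the cost of creating and joining 24 threads. Comparing that with time_parallel_sum_us(1000000000, 1) shows whether the work per thread is large enough to pay for the start-up cost.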

Upvotes: 1
