Reputation: 361
I am trying to sum even and odd numbers using C++ threads, and the code looks as follows:
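#include <chrono>
#include <iostream>
#include <thread>
using namespace std;
using namespace std::chrono;

typedef unsigned long long ull;

// Global accumulators, one per worker function.
ull EvenSum = 0;
ull OddSum = 0;

// Adds every even number in [start, end] to EvenSum.
void find_Evensum(ull start, ull end)
{
    for (ull i = start; i <= end; i++)
    {
        if ((i & 1) == 0)
        {
            EvenSum += i;
        }
    }
}

// Adds every odd number in [start, end] to OddSum.
void find_Oddsum(ull start, ull end)
{
    for (ull i = start; i <= end; i++)
    {
        if ((i & 1) == 1)
        {
            OddSum += i;
        }
    }
}

int main()
{
    ull start = 0;
    ull end = 1900000000;
    auto start_time = high_resolution_clock::now();
#if 0
    // Threaded version: change "#if 0" to "#if 1" to enable.
    thread t1(find_Evensum, start, end);
    thread t2(find_Oddsum, start, end);
    t1.join();
    t2.join();
#else
    // Sequential version.
    find_Evensum(start, end);
    find_Oddsum(start, end);
#endif
    auto stop_time = high_resolution_clock::now();
    auto duration = duration_cast<microseconds>(stop_time - start_time);
    cout << "Even Sum:" << EvenSum << endl;
    cout << "Odd Sum:" << OddSum << endl;
    // Integer division: prints whole seconds only.
    cout << "Time taken: " << (duration.count() / 1000000) << endl;
    return 0;
}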
The sequential version runs in 5 seconds, while the threaded version runs in 6 seconds. Why does the threaded version take longer than the sequential one?
The PC is an i5 with 8 cores. When I open the system monitor on Linux, I see two CPUs at 100% usage while the threads run, but execution is still slow.
On another system, the same code takes 9 seconds sequentially and 5 seconds threaded, which is what I expect, since the threaded code should be faster.
Both systems run Linux. Build command: g++ -std=c++11 -pthread main.cpp
I do not understand why this happens on only one system.
Upvotes: 0
Views: 239
Reputation: 2949
Your code is suffering from false sharing. The two counter variables share the same cache line, so the two threads keep tripping over each other: every write by one thread invalidates the cache line, and the core executing the other thread has to reload it.
Prefix the second variable with alignas(64) to force it onto a separate cache line.
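As a rough sketch of that fix, the version below aligns both counters, which is slightly more than the one alignas strictly required. It assumes a 64-byte cache line, which is typical for x86-64 but not guaranteed on every CPU. The names kCacheLine and run_threaded are hypothetical helpers added for illustration; they are not part of the original code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>

typedef unsigned long long ull;

// Assumption: 64-byte cache lines (common on x86-64, not universal).
constexpr std::size_t kCacheLine = 64;

// Each counter starts on its own 64-byte boundary, so a write to one
// does not invalidate the cache line that holds the other.
alignas(kCacheLine) ull EvenSum = 0;
alignas(kCacheLine) ull OddSum = 0;

// Adds every even number in [start, end] to EvenSum.
void find_Evensum(ull start, ull end)
{
    for (ull i = start; i <= end; i++)
        if ((i & 1) == 0)
            EvenSum += i;
}

// Adds every odd number in [start, end] to OddSum.
void find_Oddsum(ull start, ull end)
{
    for (ull i = start; i <= end; i++)
        if ((i & 1) == 1)
            OddSum += i;
}

// Hypothetical helper: resets the counters and runs both loops on
// separate threads, waiting for both to finish.
void run_threaded(ull start, ull end)
{
    // Sanity check: the two counters must not share a cache line.
    assert(reinterpret_cast<std::uintptr_t>(&EvenSum) / kCacheLine !=
           reinterpret_cast<std::uintptr_t>(&OddSum) / kCacheLine);

    EvenSum = 0;
    OddSum = 0;
    std::thread t1(find_Evensum, start, end);
    std::thread t2(find_Oddsum, start, end);
    t1.join();
    t2.join();
}
```

Calling run_threaded(0, 1900000000) from main in place of the two thread lines and building with the same g++ -std=c++11 -pthread command should show whether the padding removes the slowdown on the affected machine.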
Upvotes: 5