Not getting the expected speedup using OpenMP on non-trivial calculations

Question

I'm trying to learn OpenMP to parallelize a part of my code and I'm trying to figure out why it's not faster when using 2 threads instead of 1. Here's a minimal working example of the code:

#include <iostream>
#include <omp.h>

using namespace std;

class My_class
{
    public :

        // Constructor
        My_class(int nuIterations) 
            : prVar_(0),
              nuIters_(nuIterations)
        {} // Empty

        // Do something expensive involving the class' private vars
        void do_calculations()
        {
            for (int i=0;ido_calculations();
        }
#pragma omp section
        {
            test_object2->do_calculations();
        }
    }// End of parallel sections
    // Print results
    double end = omp_get_wtime();
    cout<<"Res 1 : "<getResult()<getResult()<



Compiling and running this using g++ myomp.cpp -O0 -std=c++11 -fopenmp gives the following execution time for 1 and 2 threads:


1 thread : 11.5 seconds
2 threads: 13.2 seconds


Is there some way I can speed this up for 2 threads? 
I am running this on a 4-core Intel i7-4600U and Ubuntu.

EDIT: Changed most of the post so that it follows the guidelines.

Zulan · Accepted Answer

There are two effects here:

1. Cache line contention (false sharing): You have two very small objects that are allocated in dynamic memory. If they end up in the same cache line (usually 64 bytes), the threads that update prVar_ compete for that line in their level 1 caches, because each write needs exclusive access to it. You should have observed this randomly: the run is sometimes significantly faster or slower depending on where the objects land in memory. To check, print the pointer addresses and divide them by 64; equal quotients mean the objects share a cache line. To address this issue, you need to pad or align the memory.
2. Load imbalance: One task does twice as much work as the other, so even under idealized conditions you will only achieve a speedup of 1.5.

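The speedup bound of 1.5 follows from a short calculation. As a sketch with assumed symbols: let w be the run time of the smaller task, so the larger one takes 2w, T_1 is the run time on one thread, T_2 the run time on two threads, and S the speedup.

```latex
T_1 = w + 2w = 3w, \qquad
T_2 = \max(w,\, 2w) = 2w, \qquad
S = \frac{T_1}{T_2} = \frac{3w}{2w} = 1.5
```

With two threads the total time is set by the longer task, so the thread that finishes early simply idles; splitting the work evenly would be needed to approach a speedup of 2.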