Speed up issues with 4 threads on quadcore system using OpenMP

Question

I have a speedup problem when using 4 threads on a quad-core system with OpenMP. With 2 threads the efficiency is close to 1, but with 4 threads it drops to about half; that is, the running time is more or less the same as when running the code with 2 threads. I searched the OpenMP forum and found a similar issue reported before, which was attributed to Intel Turbo Boost technology. Please refer to this post: http://openmp.org/forum/viewtopic.php?f=3&t=1289&start=0&hilit=intel+turbo+boost
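To make the numbers concrete, here is a minimal sketch of the two quantities involved, defined the same way the benchmark computes them (serial time divided by parallel time, and that ratio divided by the thread count). The helper names `speedup` and `efficiency` are hypothetical and only illustrate the arithmetic.

```c
#include <assert.h>

/* Speedup: serial running time divided by parallel running time. */
double speedup(double serial_time, double par_time) {
    assert(par_time > 0.0);
    return serial_time / par_time;
}

/* Efficiency: speedup divided by the number of threads used. */
double efficiency(double serial_time, double par_time, int nthreads) {
    assert(nthreads > 0);
    return speedup(serial_time, par_time) / nthreads;
}
```

With illustrative timings of 8 s serial and 4 s parallel, `efficiency(8.0, 4.0, 2)` is 1.0, while the same 4 s measured with 4 threads gives `efficiency(8.0, 4.0, 4)` of 0.5, which is the pattern described above.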

So I tried disabling Turbo Boost on all 4 processors of my machine, but that did not get rid of the problem.

I took the benchmark code from the post linked above.

I have a Dell laptop, and my hardware/OS information is summarized as follows:

OS: Linux 3.0.0.12-generic, Ubuntu
KDE SC Version : 4.7.1

Processor: Intel(R) Core(TM) i7-2620M CPU @ 2.70GHz

Please let me know what other problems could be preventing a speedup with 4 threads/cores. As additional information, I have checked that all 4 threads are running on different cores.
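A minimal sketch of one way such a check can be done, assuming Linux with glibc's `sched_getcpu()` and a compiler invoked with `-fopenmp`; the helper names `sample_thread_cpus` and `count_distinct_cpus` are hypothetical. Note that `sched_getcpu()` reports logical CPU numbers, so on a processor with Hyper-Threading, distinct numbers do not by themselves prove that the threads sit on distinct physical cores.

```c
#define _GNU_SOURCE          /* needed for sched_getcpu() in <sched.h> */
#include <assert.h>
#include <sched.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Count how many distinct CPU ids appear in cpus[0..n-1]. */
int count_distinct_cpus(const int *cpus, int n) {
    int distinct = 0;
    for (int i = 0; i < n; i++) {
        int seen = 0;
        for (int j = 0; j < i; j++) {
            if (cpus[j] == cpus[i]) { seen = 1; break; }
        }
        if (!seen) distinct++;
    }
    return distinct;
}

/* Record the logical CPU each OpenMP thread is on at the time of the call.
   cpus must have room for max_threads entries. Returns the team size
   (1 if the file was compiled without OpenMP support). */
int sample_thread_cpus(int *cpus, int max_threads) {
    int team_size = 1;
    assert(max_threads > 0);
    #pragma omp parallel num_threads(max_threads)
    {
        int tid = 0;
#ifdef _OPENMP
        tid = omp_get_thread_num();
        if (tid == 0) team_size = omp_get_num_threads();
#endif
        cpus[tid] = sched_getcpu();   /* logical CPU id, not a physical core id */
    }
    return team_size;
}
```

Calling `sample_thread_cpus(cpus, 4)` and then `count_distinct_cpus(cpus, n)` on the returned team size `n` gives the number of different logical CPUs the threads were seen on. This is only a snapshot: unless thread affinity is pinned, the scheduler may move threads afterwards.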

Looking forward to your answers.

Code:

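#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <omp.h>

/* Estimate pi from the area of a circle of the given radius, summing
   nsteps strips of width h across [-radius, radius]. */
double estimate_pi(double radius, int nsteps){

   int i;
   double h=2*radius/nsteps;
   double sum=0;
   /* ASSUMPTION: the loop body did not survive in the post as rendered;
      the lines below are a plausible reconstruction, not the original. */
   for (i=1;i<nsteps;i++){
      double x=-radius+i*h;
      sum+=sqrt(radius*radius-x*x);   /* height of the upper semicircle at x */
   }
   sum=2*sum*h/(radius*radius);       /* full-circle area divided by r^2 */
   /* printf("radius %f --> %f\n",radius,sum); */
   return sum;

}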

int main(int argc, char* argv[]){


   double ser_est,par_est;
   long int radii_range;
   if (argc>1) radii_range=atoi(argv[1]);
   else radii_range=500;   

   int nthreads;
   if (argc>2) nthreads=atoi(argv[2]);
   else nthreads=omp_get_num_procs();

   printf("Estimating Pi by averaging %ld estimates.\n",radii_range);
   printf("OpenMP says there are %d processors available.\n",omp_get_num_procs());

   int r;
   double start, stop, serial_time, par_time;



   par_est=0;
   double tmp=0;
   ser_est=0;
   start=omp_get_wtime();
   for (r=1;r<=radii_range;r++){
      tmp=estimate_pi(r,1e6);
      ser_est+=tmp;
   }
   stop=omp_get_wtime();
   serial_time=stop-start;
   ser_est=ser_est/radii_range;

   omp_set_num_threads(nthreads);
   start=omp_get_wtime();
   #pragma omp parallel for private(r,tmp) reduction(+:par_est)
   for (r=1;r<=radii_range;r++){
      tmp=estimate_pi(r,1e6);
      par_est+=tmp;
   }
   stop=omp_get_wtime();
   par_time=stop-start;
   par_est=par_est/radii_range;

   printf("Serial Estimate: %f\nParallel Estimate:%f\n\n",ser_est,par_est);
   printf("Serial Time: %f\nParallel Time:%f\nNumber of Threads: %d\nSpeedup: %f\nEfficiency: %f\n",serial_time,par_time,nthreads,serial_time/par_time, serial_time/par_time/nthreads);


}

