Reputation: 383
I'm running a very simple routine in C++ with OpenMP and measuring the elapsed time. The code reads:
#include <iostream>
#include <math.h>
#include "timer.h"
#include <omp.h>
int main ()
{
double start,finish;
int i;
int n=8000;
double a[n];
double b[n];
double c[n];
GET_TIME(start);
#pragma omp parallel private(i,a) shared(b,c,n)
{
#pragma omp for
for (i=0; i<n-1; i++)
b[i] += (a[i] + a[i+1])/2;
#pragma omp for
for (i=0; i<n-1; i++)
c[i] += (a[i] + a[i+1])/2;
}
GET_TIME(finish);
std::cout<< "Elapsed time is" <<(finish-start)<<"seconds";
return 0;
}
I'm compiling and running it with the following bash script (note that the number of threads is set through the environment variable OMP_NUM_THREADS=$n):
#!/bin/bash
clear
g++ -O3 -o test test.cpp -fopenmp
for n in $(seq 1 8); do
export OMP_NUM_THREADS=$n
./test
echo threads=$n
done
As a result, performance generally gets worse as the number of threads increases (the exact numbers vary from run to run):
Elapsed time is0.000161886secondsthreads=1
Elapsed time is0.00019002secondsthreads=2
Elapsed time is0.00226498secondsthreads=3
Elapsed time is0.000210047secondsthreads=4
Elapsed time is0.000212908secondsthreads=5
Elapsed time is0.00920105secondsthreads=6
Elapsed time is0.00937104secondsthreads=7
Elapsed time is0.000834942secondsthreads=8
Any suggestions for increasing the performance (instead of decreasing it)? Thank you very much!
Upvotes: 0
Views: 284
Reputation: 665
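You can fuse the two loops into one, which increases the amount of work each thread does per parallel loop. This helps amortize the overhead of starting and synchronizing the threads. Also, there is no need to declare b, c or n as shared: variables declared outside the parallel region are shared by default. Do not make a, b, c or n private, though: a private copy is uninitialized inside the region, and anything written to it is lost when the region ends.
#pragma omp parallel private(i)
{
// a, b, c and n stay shared (the default); only the loop index is private
#pragma omp for schedule(static)
for (i=0; i<n-1; i++){
double avg = (a[i] + a[i+1])/2; // computed once, used for both arrays
b[i] += avg;
c[i] += avg;}
}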
Upvotes: 2