I am learning how to use OpenMP in a C program. I noticed that "#pragma omp atomic" increases the runtime when updating a 1D array, even when the number of threads is 1. Here is my code:
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include <mpi.h>
#include <omp.h>

double fixwork(int a, int n) //n==L
{
    int j;
    double s, x, y;
    double t = 0;
    for (j = 0; j < n; j++)
    {
        s = 1.0 * j * a;
        x = (1.0 - cos(s)) / 2.0;
        y = 0.31415926 * x;
        t += y;
    }
    return t;
}

int main(int argc, char* argv[])
{
    int n = 100000;
    int p = 1;
    int L = 2;
    int q = 100;
    int g = 7;
    int i, j, k;
    double v;
    int np, rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &np);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    double* u = (double*)calloc(n * g, sizeof(double));
    double* w = (double*)calloc(n * g, sizeof(double));
    double omptime1 = -MPI_Wtime();
    #pragma omp parallel for private(k, j, v) num_threads(p)
    for (i = 0; i < n; i++)
    {
        k = i * (int)ceil(1.0 * (i % q) / q);
        for (j = 0; j < g; j++)
        {
            v = fixwork(i * g + j, L);
            #pragma omp atomic
            u[k] += v;
        }
    }
    omptime1 += MPI_Wtime();
    printf("\npragma time = %f", omptime1);
    MPI_Finalize();
    return 0;
}
I compiled this code with:
mpiicc -qopenmp atomictest.c -o atomic
With 1 OpenMP thread and 1 MPI process, the observed ratio time(with atomic)/time(without atomic) is ~1.28 (n=1e6) and ~1.07 (n=1e7), and even larger for smaller n. This suggests that the atomic directive itself adds overhead, even with no other thread competing for the update. What is the reason for this overhead? What is the difference between the machine operations of "omp atomic" and "c++ atomic"? Thanks
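For reference, the comparison can be reduced to a sketch without MPI that puts the two variants side by side. This is only an illustration under assumptions, not the exact program timed above: `cheap_work`, `target_index`, `update_atomic`, `update_plain`, and `time_update` are hypothetical names, `cheap_work` is a simplified stand-in for `fixwork` (so it links without `-lm`), and `clock()` reports CPU time, which should be close to wall time only in the single-thread case.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-in for fixwork(): cheap, deterministic, no libm needed. */
static double cheap_work(int a)
{
    return 0.31415926 * (double)(a % 7) / 7.0;
}

/* Same mapping as k = i * (int)ceil(1.0 * (i % q) / q) in the question:
 * 0 when i is a multiple of q, otherwise i. */
static int target_index(int i, int q)
{
    return (i % q == 0) ? 0 : i;
}

/* Update loop with the atomic directive, pinned to one thread. */
static void update_atomic(double *u, int n, int g, int q)
{
    assert(u != NULL && q > 0);
    #pragma omp parallel for num_threads(1)
    for (int i = 0; i < n; i++) {
        int k = target_index(i, q);
        for (int j = 0; j < g; j++) {
            double v = cheap_work(i * g + j);
            #pragma omp atomic
            u[k] += v;
        }
    }
}

/* Same loop without the atomic directive. Only safe with one thread,
 * because every multiple of q writes to u[0]. */
static void update_plain(double *u, int n, int g, int q)
{
    assert(u != NULL && q > 0);
    #pragma omp parallel for num_threads(1)
    for (int i = 0; i < n; i++) {
        int k = target_index(i, q);
        for (int j = 0; j < g; j++) {
            double v = cheap_work(i * g + j);
            u[k] += v;
        }
    }
}

/* Runs one variant on u[0..n-1] and returns the elapsed CPU seconds. */
static double time_update(void (*update)(double *, int, int, int),
                          double *u, int n, int g, int q)
{
    clock_t start = clock();
    update(u, n, g, q);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Calling `time_update(update_atomic, u1, n, 7, 100)` and `time_update(update_plain, u2, n, 7, 100)` on two zero-initialized arrays of length n gives the two times whose ratio I am asking about. With one thread both arrays should end up identical, so any difference in time comes from the directive rather than from the arithmetic.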