Antreas Petrou
Antreas Petrou

Reputation: 51

Parallelization with OpenMP

I have this code:

 for (i=1; i<=imax-1; i++) {
        for (j=1; j<=jmax; j++) {
            /* only if both adjacent cells are fluid cells */
            if ((flag[i][j] & C_F) && (flag[i+1][j] & C_F)) {
                du2dx = ((u[i][j]+u[i+1][j])*(u[i][j]+u[i+1][j])+
                    gamma*fabs(u[i][j]+u[i+1][j])*(u[i][j]-u[i+1][j])-
                    (u[i-1][j]+u[i][j])*(u[i-1][j]+u[i][j])-
                    gamma*fabs(u[i-1][j]+u[i][j])*(u[i-1][j]-u[i][j]))
                    /(4.0*delx);
                duvdy = ((v[i][j]+v[i+1][j])*(u[i][j]+u[i][j+1])+
                    gamma*fabs(v[i][j]+v[i+1][j])*(u[i][j]-u[i][j+1])-
                    (v[i][j-1]+v[i+1][j-1])*(u[i][j-1]+u[i][j])-
                    gamma*fabs(v[i][j-1]+v[i+1][j-1])*(u[i][j-1]-u[i][j]))
                    /(4.0*dely);
                laplu = (u[i+1][j]-2.0*u[i][j]+u[i-1][j])/delx/delx+
                    (u[i][j+1]-2.0*u[i][j]+u[i][j-1])/dely/dely;

                f[i][j] = u[i][j]+del_t*(laplu/Re-du2dx-duvdy);
            }
            else {
                f[i][j] = u[i][j];
            }
        }
    }

and I want to create 4 threads. One to compute du2dx, duvdy, laplu and f[i][j]. Is there a way to do that with OpenMP?

Upvotes: 1

Views: 128

Answers (2)

Antonin GAVREL
Antonin GAVREL

Reputation: 11219

You can use

#pragma omp parallel for private(i)

Right before your initial for, as stated in this tutorial

NB: There are other very cool features, you might want to check.

Upvotes: 1

Jim Cownie
Jim Cownie

Reputation: 2859

You are right, @Andreas. By default in most implementations, omp parallel will create as many threads as are required to exploit the available parallelism in the machine. omp for will then distribute entire loop iterations to the threads (exactly how will depend on the schedule chosen). So they will not split individual statements from inside the loop and execute each as a separate parallel task.

But, OTOH, "Why do you want to limit yourself to four threads?" Is this code so unimportant that

  1. No one else will ever use it?
  2. You will throw it away before you get a better machine?

Note that what you say you want to do (execute each statement in a separate thread) makes no sense, since there is a dependency between the statements. The final statement

f[i][j] = u[i][j]+del_t*(laplu/Re-du2dx-duvdy);

depends on all of the preceding ones.

So, your best bet is to go with a simple OpenMP parallelisation, though probably something like this will work best

#pragma omp parallel for collapse(2), \
     schedule(nonmonotonic:dynamic),\
     shared(flag,u,gamma,Re,imax,jmax),\
     private(i,j,du2dx,duvdy,laplu)
for (i=1; i<=imax-1; i++) {
        for (j=1; j<=jmax; j++) {
            /* only if both adjacent cells are fluid cells */
            if ((flag[i][j] & C_F) && (flag[i+1][j] & C_F)) {
                du2dx = ((u[i][j]+u[i+1][j])*(u[i][j]+u[i+1][j])+
                    gamma*fabs(u[i][j]+u[i+1][j])*(u[i][j]-u[i+1][j])-
                    (u[i-1][j]+u[i][j])*(u[i-1][j]+u[i][j])-
                    gamma*fabs(u[i-1][j]+u[i][j])*(u[i-1][j]-u[i][j]))
                    /(4.0*delx);
                duvdy = ((v[i][j]+v[i+1][j])*(u[i][j]+u[i][j+1])+
                    gamma*fabs(v[i][j]+v[i+1][j])*(u[i][j]-u[i][j+1])-
                    (v[i][j-1]+v[i+1][j-1])*(u[i][j-1]+u[i][j])-
                    gamma*fabs(v[i][j-1]+v[i+1][j-1])*(u[i][j-1]-u[i][j]))
                    /(4.0*dely);
                laplu = (u[i+1][j]-2.0*u[i][j]+u[i-1][j])/delx/delx+
                    (u[i][j+1]-2.0*u[i][j]+u[i][j-1])/dely/dely;

                f[i][j] = u[i][j]+del_t*(laplu/Re-du2dx-duvdy);
            }
            else {
                f[i][j] = u[i][j];
            }
        }
    }

Upvotes: 4

Related Questions