Reputation: 51
I have this code:
for (i=1; i<=imax-1; i++) {
for (j=1; j<=jmax; j++) {
/* only if both adjacent cells are fluid cells */
if ((flag[i][j] & C_F) && (flag[i+1][j] & C_F)) {
du2dx = ((u[i][j]+u[i+1][j])*(u[i][j]+u[i+1][j])+
gamma*fabs(u[i][j]+u[i+1][j])*(u[i][j]-u[i+1][j])-
(u[i-1][j]+u[i][j])*(u[i-1][j]+u[i][j])-
gamma*fabs(u[i-1][j]+u[i][j])*(u[i-1][j]-u[i][j]))
/(4.0*delx);
duvdy = ((v[i][j]+v[i+1][j])*(u[i][j]+u[i][j+1])+
gamma*fabs(v[i][j]+v[i+1][j])*(u[i][j]-u[i][j+1])-
(v[i][j-1]+v[i+1][j-1])*(u[i][j-1]+u[i][j])-
gamma*fabs(v[i][j-1]+v[i+1][j-1])*(u[i][j-1]-u[i][j]))
/(4.0*dely);
laplu = (u[i+1][j]-2.0*u[i][j]+u[i-1][j])/delx/delx+
(u[i][j+1]-2.0*u[i][j]+u[i][j-1])/dely/dely;
f[i][j] = u[i][j]+del_t*(laplu/Re-du2dx-duvdy);
}
else {
f[i][j] = u[i][j];
}
}
}
and I want to create 4 threads. One to compute du2dx, duvdy, laplu and f[i][j]. Is there a way to do that with OpenMP?
Upvotes: 1
Views: 128
Reputation: 11219
You can use
#pragma omp parallel for private(i)
Right before your initial for, as stated in this tutorial
NB: There are other very cool features, you might want to check.
Upvotes: 1
Reputation: 2859
You are right, @Andreas. By default in most implementations, omp parallel
will create as many threads as are required to exploit the available parallelism in the machine. omp for
will then distribute entire loop iterations to the threads (exactly how will depend on the schedule chosen). So they will not split individual statements from inside the loop and execute each as a separate parallel task.
But, OTOH, "Why do you want to limit yourself to four threads?" Is this code so unimportant that
Note that what you say you want to do (execute each statement in a separate thread) makes no sense, since there is a dependency between the statements. The final statement
f[i][j] = u[i][j]+del_t*(laplu/Re-du2dx-duvdy);
depends on all of the preceding ones.
So, your best bet is to go with a simple OpenMP parallelisation, though probably something like this will work best
#pragma omp parallel for collapse(2), \
schedule(nonmonotonic:dynamic),\
shared(flag,u,gamma,Re,imax,jmax),\
private(i,j,du2dx,duvdy,laplu)
for (i=1; i<=imax-1; i++) {
for (j=1; j<=jmax; j++) {
/* only if both adjacent cells are fluid cells */
if ((flag[i][j] & C_F) && (flag[i+1][j] & C_F)) {
du2dx = ((u[i][j]+u[i+1][j])*(u[i][j]+u[i+1][j])+
gamma*fabs(u[i][j]+u[i+1][j])*(u[i][j]-u[i+1][j])-
(u[i-1][j]+u[i][j])*(u[i-1][j]+u[i][j])-
gamma*fabs(u[i-1][j]+u[i][j])*(u[i-1][j]-u[i][j]))
/(4.0*delx);
duvdy = ((v[i][j]+v[i+1][j])*(u[i][j]+u[i][j+1])+
gamma*fabs(v[i][j]+v[i+1][j])*(u[i][j]-u[i][j+1])-
(v[i][j-1]+v[i+1][j-1])*(u[i][j-1]+u[i][j])-
gamma*fabs(v[i][j-1]+v[i+1][j-1])*(u[i][j-1]-u[i][j]))
/(4.0*dely);
laplu = (u[i+1][j]-2.0*u[i][j]+u[i-1][j])/delx/delx+
(u[i][j+1]-2.0*u[i][j]+u[i][j-1])/dely/dely;
f[i][j] = u[i][j]+del_t*(laplu/Re-du2dx-duvdy);
}
else {
f[i][j] = u[i][j];
}
}
}
Upvotes: 4