C++ OpenMP directives for a parallel for loop?

Question

I am trying to apply OpenMP to a particular code snippet. I'm not sure whether the snippet needs a revamp; perhaps it is structured too rigidly around a sequential implementation. Anyway, here is the (pseudo-)code that I'm trying to parallelize:

#pragma omp parallel for private(id, local_info, current_local_cell_id, local_subdomain_size) shared(cells, current_global_cell_id, global_id)
for(id = 0; id < grid_size; ++id) {
   local_info = cells.get_local_subdomain_info(id);
   local_subdomain_size = local_info.size();
   ...do other stuff...
   do {
      current_local_cell_id = cells.get_subdomain_cell_id(id);
      global_id.set(id, current_global_cell_id + current_local_cell_id);
   } while(id < local_subdomain_size && ++id);
   current_global_cell_id += local_subdomain_size;
}

This makes complete sense (after staring at it for some time) in a sequential sense, which also might mean that it needs to be re-written for OpenMP. My concern is that current_local_cell_id and local_subdomain_size are private, but current_global_cell_id and global_id are shared.

Hence the statement current_global_cell_id += local_subdomain_size after the inner loop:

do {
  ...
} while(...)
current_global_cell_id += local_subdomain_size;

might lead to errors in the OpenMP setting, I suspect. I would greatly appreciate it if any OpenMP experts could give me some pointers on special OpenMP directives that would let me make minimal changes to the code and still take advantage of OpenMP for this type of for loop.

sehe · Accepted Answer

I'm not sure I understand your code. However, I think you really want some kind of parallel accumulation.

You could use a pattern like

 size_t total = 0;
 #pragma omp parallel for reduction(+:total)
 for (int i=0; i
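
The loop in the snippet above is cut off in the source. As a minimal, self-contained sketch of the same reduction pattern, the following sums a vector of sizes. The names `item_size` and `total_size` are hypothetical and chosen only for illustration; they are not part of the original answer.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the per-item work (an assumption, not from the answer).
std::size_t item_size(const std::vector<std::size_t>& sizes, std::size_t i) {
    return sizes[i];
}

std::size_t total_size(const std::vector<std::size_t>& sizes) {
    std::size_t total = 0;
    const long n = static_cast<long>(sizes.size());
    // Each thread accumulates into its own private copy of `total`;
    // OpenMP combines the copies with + when the loop finishes.
    // `total` appears only in the reduction clause, not in shared().
    #pragma omp parallel for reduction(+:total)
    for (long i = 0; i < n; ++i) {
        total += item_size(sizes, static_cast<std::size_t>(i));
    }
    return total;
}
```

Built without OpenMP support the pragma is ignored and the loop simply runs sequentially. Note that a reduction only yields the final total. The running offset in the question (current_global_cell_id) is read inside each iteration, so it would probably have to be computed up front, for example as a prefix sum over the subdomain sizes, before the loop can be parallelized.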



On a related note, when you use GCC you can use __gnu_parallel::accumulate (from the header parallel/numeric, compiled with -fopenmp) as a drop-in replacement for std::accumulate, which computes the same result. See Chapter 18, Parallel Mode, in the libstdc++ manual.

 size_t total = __gnu_parallel::accumulate(c.begin(), c.end(), size_t(0), &myvalue_accum);
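
For a fuller picture, here is a hedged sketch of that call in a compilable form. The answer does not define `myvalue_accum`, so the version below is a hypothetical one that simply adds two values; `sum_sizes` and the `SKETCH_HAVE_GNU_PARALLEL` macro are likewise illustrative names. The sketch falls back to std::accumulate when OpenMP or the parallel-mode header is not available.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Use libstdc++ parallel mode only when OpenMP is enabled and the header exists.
#if defined(_OPENMP) && defined(__has_include)
#  if __has_include(<parallel/numeric>)
#    include <parallel/numeric>
#    define SKETCH_HAVE_GNU_PARALLEL 1
#  endif
#endif

// Hypothetical binary operation (an assumption, not defined in the answer).
// It takes two values of the result type and is associative, so partial
// results computed by different threads can be combined in any grouping.
std::size_t myvalue_accum(std::size_t a, std::size_t b) { return a + b; }

std::size_t sum_sizes(const std::vector<std::size_t>& c) {
    // The initial value has type std::size_t, so the accumulator does too.
#ifdef SKETCH_HAVE_GNU_PARALLEL
    return __gnu_parallel::accumulate(c.begin(), c.end(), std::size_t{0}, &myvalue_accum);
#else
    return std::accumulate(c.begin(), c.end(), std::size_t{0}, &myvalue_accum);
#endif
}
```

With GCC this would typically be built with something like g++ -fopenmp. The type of the initial value determines the accumulator type, which is why the sketch passes std::size_t{0} rather than a plain 0.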


You can even compile with -D_GLIBCXX_PARALLEL, which makes all uses of std algorithms automatically parallelized where possible. Don't use that unless you know what you're doing! Frequently, performance just suffers, and the chance of introducing bugs due to unexpected parallelism is real.
