Fopa Léon Constantin

Reputation: 12363

OpenMP: how to get better work balance?

I'm working on a program that has to apply a computation foobar to many files. On a single file, foobar can run either sequentially or in parallel. The program receives many files (which can differ in size!) and applies foobar to each of them, sequentially or in parallel, using a specified number of threads.

Here is how the program is launched on 8 files with three threads:

./program 3 file1 file2 file3 file4 file5 file6 file7 file8

The default scheduling I've implemented assigns one thread to each file, with the files processed in parallel (that's how my program works now!).

Edit: Here is the default scheduling that I'm using

#pragma omp parallel for private(i) schedule(guided,1)
for (i = 0; i < nbre_file; i++)
   foobar(files[i]);  // depending on the size of files[i], foobar may run sequentially or in parallel (which can lead to nested parallel loops)

See the image below

[Image: default scheduling]

In the image above, the final time is the time spent running foobar sequentially on the biggest file, file8.

I think a better scheduling, one that deals effectively with work balance, would be to run foobar in parallel on the big files, as in the image below, where tr i denotes thread i.

[Image: proposed scheduling]

That way, the final time would be the time spent running foobar in parallel on the biggest file, file8 (the image above uses two threads!).

My question is:

Is it possible to do such scheduling with OpenMP?

Thanks for any reply!

Upvotes: 2

Views: 407

Answers (1)

tune2fs

Reputation: 7705

Have you tried dynamic scheduling instead of guided?
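As a rough sketch of what dynamic scheduling could look like here: with schedule(dynamic, 1) each idle thread picks up the next unprocessed file, and sorting the files from largest to smallest first usually helps, because the big files are started early instead of being left for the end. The file_entry struct, process_all, and foobar_stub below are hypothetical names used only for illustration; foobar_stub stands in for the real foobar and simply returns the file size. The order in which chunks are handed out is what typical OpenMP runtimes do, not something this sketch guarantees.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical record: a file name plus its size in bytes. */
typedef struct {
    const char *name;
    long size;
} file_entry;

/* qsort comparator: larger files come first. */
static int by_size_desc(const void *a, const void *b)
{
    long sa = ((const file_entry *)a)->size;
    long sb = ((const file_entry *)b)->size;
    return (sa < sb) - (sa > sb);
}

/* Stand-in for the question's foobar(); it only returns the size. */
static long foobar_stub(const file_entry *f)
{
    return f->size;
}

/* Sorts the files biggest first, then processes them with dynamic
   scheduling, one file per chunk. Returns the summed stub results. */
long process_all(file_entry *files, int n)
{
    assert(n >= 0);
    long total = 0;

    qsort(files, (size_t)n, sizeof files[0], by_size_desc);

    /* Without -fopenmp the pragma is ignored and the loop runs serially. */
    #pragma omp parallel for schedule(dynamic, 1) reduction(+:total)
    for (int i = 0; i < n; i++)
        total += foobar_stub(&files[i]);

    return total;
}
```

Compile with -fopenmp (GCC or Clang) to enable the pragma. This only balances whole files across threads; it does not split the work of a single large file.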

If the standard scheduling clauses do not work for you, you can parallelize the loop by hand and assign the files to specific threads yourself. Your loop would then look like this:

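#pragma omp parallel num_threads(3)  // the layout below expects exactly three threads
{
   // Declared inside the region, so each thread gets its own copy (no data race).
   int id = omp_get_thread_num();
   if (id == 0) { // thread 0: all the small files
       for (int i = 0; i < nbre_of_small_files; i++)
           foobar(files[i]);
   }
   else { // threads 1 and 2: alternate over the big files
       // Assumption: files[] holds the small files first, followed by the big ones.
       int first_big = nbre_of_small_files;
       int end_big = nbre_of_small_files + nbre_of_big_files;
       for (int j = first_big + (id - 1); j < end_big; j += 2)
           foobar(files[j]);  // thread 1 takes big files 0, 2, 4, ...; thread 2 takes 1, 3, 5, ...
   }
}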

Here thread 0 processes all the small files, while threads 1 and 2 split the big files between them.
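The manual assignment needs to know which files count as small and which as big. One possible way to get nbre_of_small_files and nbre_of_big_files is to partition the file list by a size threshold beforehand. The sketch below is only an illustration: partition_by_size, the sizes array, and the threshold parameter are hypothetical and not part of the original program.

```c
#include <assert.h>

/* Moves every entry with sizes[i] < threshold to the front of both arrays
   (keeping files[] and sizes[] in step) and returns how many entries are
   small. The big files end up in files[small .. n-1]. */
int partition_by_size(const char **files, long *sizes, int n, long threshold)
{
    assert(n >= 0);
    int small = 0;

    for (int i = 0; i < n; i++) {
        if (sizes[i] < threshold) {
            /* Swap entry i into the "small" prefix. */
            const char *tmp_name = files[i];
            long tmp_size = sizes[i];
            files[i] = files[small];
            sizes[i] = sizes[small];
            files[small] = tmp_name;
            sizes[small] = tmp_size;
            small++;
        }
    }
    return small;
}
```

With the return value stored in nbre_of_small_files, nbre_of_big_files is simply n minus that value. Choosing a good threshold depends on how long foobar takes per byte, which has to be measured.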

Upvotes: 1
