Reputation: 582
I am trying to make my for loop parallel in C++. The iterations are completely independent. Below is a similar program that captures the idea of the task.
class A {
    // create experiment
    // perform experiment
    // append results to file
    // reset the experiment
};
int main() {
    // open a file
    // instance class
    A a;
    int N = 10000;
    for (int i = 0; i <= N; i++) {
        a.do_something();
    }
    // close file
    // return
}
Each iteration simply prints its data to an output file, and the order of the output is unimportant. Since a.do_something() is lengthy, I would like to make it parallel. I have installed MPI and am now somewhat familiar with its basic use.
My logic is to split the range N into partitions depending on the number of processors available. I am looking for some assistance on how to turn my serial version into a parallel one with MPI. My attempt is:
class A {
    // create experiment
    // perform experiment
    // append results to file
    // reset the experiment
};
int main(int argc, char** argv) {
    // open a file
    // instance class
    A a;
    // initialise the MPI
    int ierr = MPI_Init(&argc, &argv);
    int procid, numprocs;
    ierr = MPI_Comm_rank(MPI_COMM_WORLD, &procid);
    ierr = MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
    // partition = (job size) over (processors).
    unsigned int partition = N / numprocs;
    int N = 10000;
    for (int i = 0; i <= N; i++) {
        a.do_something();
    }
    ierr = MPI_Finalize();
    // close file
    // return
}
But I am really struggling to split the for loop and don't know how to proceed.
This will just run the serial code twice (on my 2-core machine). I want to split the for loop into chunks of N/2 iterations and have each process tackle a different chunk.
Would I need to keep a core back to broadcast the jobs to the other cores? Could I iterate over the partitions? I have searched online and haven't had much luck. Any suggestions?
Upvotes: 2
Views: 9945
Reputation: 810
A simple way to do that is:
for ( int i = 0; i <= N; i++ )
{
    if (i % numprocs != procid) continue;  // skip iterations that belong to another rank
    a.do_something();
}
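To see which iterations each process ends up with under this round-robin split, here is a minimal sketch that leaves MPI out so it can be compiled on its own. The helper name `cyclic_indices` and its parameter `last` are made up for illustration; in the real program `procid` and `numprocs` would come from MPI_Comm_rank and MPI_Comm_size as in the question, and the loop body would call a.do_something() instead of recording the index.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not part of MPI): returns the iteration indices in
// 0..last (inclusive, matching the question's `i <= N` loop) that the process
// with rank `procid` would execute under a round-robin split.
std::vector<int> cyclic_indices(int last, int numprocs, int procid) {
    assert(numprocs > 0 && procid >= 0 && procid < numprocs);
    std::vector<int> mine;
    for (int i = 0; i <= last; i++) {
        if (i % numprocs != procid) continue;  // this iteration belongs to another rank
        mine.push_back(i);                     // real code: a.do_something();
    }
    return mine;
}
```

With two processes and `last = 10`, rank 0 gets the even indices and rank 1 the odd ones, so every iteration is executed exactly once and the work differs by at most one iteration between ranks.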
Upvotes: 2
Reputation: 413
When the MPI part of the code starts, think of it as independent programs, one running on each processor. This means that the loop you wrote runs in full, independently, on both processors. One way to split it would be, for example:
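// procid and partition are the variables from the question.
// Each rank handles the half-open range [procid*partition, (procid+1)*partition),
// so neighbouring ranks do not both run the boundary iteration.
// Iterations left over when the count is not divisible by numprocs
// still have to be assigned to some rank.
for ( int i = procid*partition; i < (procid+1)*partition; i++ )
{
    a.do_something();
}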
Also, declare N before you use it :-)
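If the number of iterations does not divide evenly by the number of processes, the leftover iterations have to go somewhere. A minimal sketch of one common way to do it, assuming the first few ranks each take one extra iteration: the helper name `block_range` and the parameter `count` are made up for illustration, and the code leaves MPI out so it compiles on its own. In the real program `procid` and `numprocs` would come from MPI_Comm_rank and MPI_Comm_size.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>

// Hypothetical helper (not part of MPI): returns the half-open range
// [begin, end) of iterations for rank `procid` when `count` iterations are
// split into contiguous blocks across `numprocs` ranks.
std::pair<int, int> block_range(int count, int numprocs, int procid) {
    assert(numprocs > 0 && procid >= 0 && procid < numprocs);
    int base  = count / numprocs;  // iterations every rank gets
    int extra = count % numprocs;  // leftovers: the first `extra` ranks take one more
    int begin = procid * base + std::min(procid, extra);
    int end   = begin + base + (procid < extra ? 1 : 0);
    return {begin, end};
}
```

For the loop in the question, which runs i = 0..N inclusive, `count` would be N + 1 = 10001. With two processes, rank 0 would loop over [0, 5001) and rank 1 over [5001, 10001), e.g. `for (int i = r.first; i < r.second; i++) a.do_something();` where `r` is the pair returned for that rank.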
Upvotes: 5