user4085386

Two MPI_Allreduce() functions not working, gives error of NULL Communicator

I am using an example code from an MPI book [will give the name shortly].

What it does is the following:

a) It creates two communicators: world (= MPI_COMM_WORLD), which contains all the processes, and workers, which excludes the random number generator server (the process with the last rank).

b) The server generates random numbers and sends them to the workers on request.

c) Each worker separately counts the number of samples falling inside and outside the unit circle inscribed in the enclosing square.

d) The inside and outside counts are combined with MPI_Allreduce() to estimate pi as 4*in/(in+out), and this repeats until the estimate is accurate enough.
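To make steps c) and d) concrete, here is a minimal single-process sketch of the counting and the final formula, without any MPI. The names estimate_pi and to_coord are hypothetical helpers, not from the book's code, and rand() with RAND_MAX stands in for the numbers the server would send.

```c
#include <stdlib.h>

/* Hypothetical helper: map a rand() value onto the interval [-1, 1]. */
static double to_coord(int r)
{
    return ((double) r / RAND_MAX) * 2.0 - 1.0;
}

/* Hypothetical helper: draw `samples` random points in the square
   [-1, 1] x [-1, 1], count those inside and outside the unit circle,
   and return the estimate 4 * in / (in + out). samples must be > 0. */
double estimate_pi(unsigned seed, int samples, int *in, int *out)
{
    int i;
    srand(seed);
    *in = *out = 0;
    for (i = 0; i < samples; ++i) {
        double x = to_coord(rand());
        double y = to_coord(rand());
        if (x * x + y * y < 1.0) ++*in;
        else ++*out;
    }
    return (4.0 * *in) / (*in + *out);
}
```

In the MPI version each worker keeps its own in and out, and the two MPI_Allreduce() calls produce the totals that go into the same formula.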

**The code compiles fine. However, when I run it with the following command (or with any other value of n):**

>mpiexec -n 2 apple.exe 0.0001

I get the following errors:

Fatal error in MPI_Allreduce: Invalid communicator, error stack:
MPI_Allreduce(855): MPI_Allreduce(sbuf=000000000022EDCC, rbuf=000000000022EDDC,
count=1, MPI_INT, MPI_SUM, MPI_COMM_NULL) failed
MPI_Allreduce(780): Null communicator
pi =  0.00000000000000000000
job aborted:
rank: node: exit code[: error message]
0: PC: 1: process 0 exited without calling finalize
1: PC: 123

Edit: (Removed: But when I remove either one of the two MPI_Allreduce() calls, it runs without any runtime errors, albeit with the wrong answer.)

Code:

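#include <mpi.h>
#include <mpe.h>
#include <stdio.h>   /* sscanf, printf, getchar */
#include <stdlib.h>  /* srand, rand */
#include <math.h>    /* fabs */
#include <limits.h>  /* INT_MAX */
#include <time.h>    /* time */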

#define CHUNKSIZE 1000
/* message tags */
#define REQUEST 1
#define REPLY 2

int main(int argc, char *argv[])
{
    int iter;
    int in, out, i, iters, max, ix, iy, ranks [1], done, temp;
    double x, y, Pi, error, epsilon;
    int numprocs, myid, server, totalin, totalout, workerid;
    int rands[CHUNKSIZE], request;
    MPI_Comm world, workers;
    MPI_Group world_group, worker_group;
    MPI_Status status;
    MPI_Init(&argc,&argv);
    world = MPI_COMM_WORLD;
    MPI_Comm_size(world,&numprocs);
    MPI_Comm_rank(world,&myid);
    server = numprocs-1; /* last proc is server */
    if(myid==0) sscanf(argv[1], "%lf", &epsilon);
    MPI_Bcast(&epsilon, 1, MPI_DOUBLE, 0, MPI_COMM_WORLD);
    MPI_Comm_group(world, &world_group);
    ranks[0] = server;
    MPI_Group_excl(world_group, 1, ranks, &worker_group);
    MPI_Comm_create(world, worker_group, &workers);
    MPI_Group_free(&worker_group);
    if(myid==server)   /* I am the rand server */
    {
        srand(time(NULL));
        do
        {
            MPI_Recv(&request, 1, MPI_INT, MPI_ANY_SOURCE, REQUEST, world, &status);
            if(request)
            {
                for(i=0; i<CHUNKSIZE;)
                {
                    rands[i] = rand();
                    if(rands[i]<=INT_MAX) ++i;
                }
                MPI_Send(rands, CHUNKSIZE, MPI_INT,status.MPI_SOURCE, REPLY, world);
            }
        }
        while(request>0);
    }
    else   /* I am a worker process */
    {
        request = 1;
        done = in = out = 0;
        max = INT_MAX; /* max int, for normalization */
        MPI_Send(&request, 1, MPI_INT, server, REQUEST, world);
        MPI_Comm_rank(workers, &workerid);
        iter = 0;
        while(!done)
        {
            ++iter;
            request = 1;
            MPI_Recv(rands, CHUNKSIZE, MPI_INT, server, REPLY, world, &status);
            for(i=0; i<CHUNKSIZE;)
            {
                x = (((double) rands[i++])/max)*2-1;
                y = (((double) rands[i++])/max)*2-1;
                if(x*x+y*y<1.0) ++in;
                else ++out;
            }

            /* ** see error here ** */
            MPI_Allreduce(&in, &totalin, 1, MPI_INT, MPI_SUM, workers);
            MPI_Allreduce(&out, &totalout, 1, MPI_INT, MPI_SUM, workers);
            /* only one of the above two MPI_Allreduce() functions working */

            Pi = (4.0*totalin)/(totalin+totalout);
            error = fabs( Pi-3.141592653589793238462643);
            done = (error<epsilon||(totalin+totalout)>1000000);
            request = (done)?0:1;
            if(myid==0)
            {
                printf("\rpi = %23.20f", Pi);
                MPI_Send(&request, 1, MPI_INT, server, REQUEST, world);
            }
            else
            {
                if(request)
                    MPI_Send(&request, 1, MPI_INT, server, REQUEST, world);
            }
            MPI_Comm_free(&workers);
        }
    }
    if(myid==0)
    {
        printf("\npoints: %d\nin: %d, out: %d, <ret> to exit\n", totalin+totalout, totalin, totalout);
        getchar();
    }
    MPI_Finalize();
}

What is the error here? Am I missing something? Any help or pointer will be highly appreciated.

Upvotes: 0

Views: 583

Answers (1)

Hristo Iliev

Reputation: 74395

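You are freeing the workers communicator before you are done using it. MPI_Comm_free(&workers) sits inside the while(!done) loop, so after the first iteration the handle is set to MPI_COMM_NULL and the next MPI_Allreduce() fails with "Null communicator", which is exactly what the error stack shows. Move the MPI_Comm_free(&workers) call after the while(!done) { ... } loop.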

Upvotes: 1
