
Reputation: 9

MPI program in C crashing

My program runs for a while and then crashes at some point. After scouring the code, I've come to the conclusion that I don't know enough to figure out why. Can someone offer some help? Below is main(). I'd be happy to post the other source files if you ask; I just didn't want to post too much.

Thanks, Scott

int main(int argc, char *argv[])
{
    //Global data goes here
    int rank, nprocs, i, j, k, rc, chunkSize;
    double start, finish, difference;
    MPI_Status status;
    int *masterArray;
    int *slaveArray;
    int *subArray;
    //Holder for subArrays for reassembly of subArrays
    int **arrayOfArrays;
    //Beginning and ARRAYSIZE indices of array
    Range range;

    //Begin execution

    //printf("%s", "Entering main()\n");
    MPI_Init(&argc, &argv); /* START MPI */

    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    //printf("My rank %d\n", rank);

    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
    //printf("Number of processes %d\n", nprocs);

    //Compute chunk size
    chunkSize = computeChunkSize(ARRAYSIZE, nprocs);
    //debug("%s: %d\n", "Chunk size", chunkSize);
    //                      N/#processes
    slaveArray = (int *)malloc(sizeof(int) * (chunkSize+1));

    //An array of int arrays (a pointer to pointers to ints)
    arrayOfArrays = (int **)malloc(sizeof(int *) * (nprocs-1));

    /************************ MASTER id == 0 ************************/

    /* MASTER: rank is 0. Problem decomposition- here simple matter of splitting
    the master array evenly across the number of worker bees */
    if(rank == MASTER)
    {
        debug("%s", "Entering MASTER process\n");

        //Begin timing the runtime of this application
        start = MPI_Wtime();
        debug("%s: %lg\n", "Start time", start);

        //Seed the random number generator
        //Create random array of ints for mpi processing
        masterArray = createRandomArray();

        debug("%s %d %s %d %s\n", "Master array of random integers from ", BEGIN, " to ", ARRAYSIZE-1, "\n");

        /*Create the subArray to be sent to the slaves- malloc returns a pointer
        to void, so explicitly coerce the pointer into the desired type with a cast */
        subArray = (int *)malloc(sizeof(int) * (chunkSize+1));

        //Initalize range
        range = (Range){.begin = 0, .end = (ARRAYSIZE/(nprocs-1))};
        debug("%s %d %s %d\n", "Range: ", range.begin, " to ", range.end);

        //Master decomposes the problem set: begin and end of each subArray sent to slaves
        for(i = 1;i < nprocs; i++)
        {
            //printf("%s", "Inside loop for Master send\n");

            range = decomposeProblem(range.begin, range.end, ARRAYSIZE, nprocs, i);

            debug("%s %d to %d%s", "Range from decomposition", range.begin, range.end, "\n");
            //Index for subArray
            k = 0;

            //Transfer the slice of the master array to the subArray
            for(j = range.begin; j < range.end; j++)
            {
                subArray[k] = masterArray[j];
                //printf("%d\t", subArray[k]);
            }
            //printf("%s", "\n");
            //Show sub array contents
            debug("%s", "Showing subArray before master sends...\n");
            showArray(subArray, 0, k);

            //printf("%s %d%s", "Send to slave", i, " from master \n");
            debug("%s %d%s", "Send to slave", i, " from master \n");
            /************************ MASTER: SEND **************************/
            rc = MPI_Send(&subArray, chunkSize, MPI_INT, i, 0, MPI_COMM_WORLD);
        }
        //Blocks until the slaves finish their work and start sending results back to master
        /*MPI_Recv is "blocking" in the sense that when the process (in this case
        my_rank == 0) reaches the MPI_Recv statement, it will wait until it
        actually receives the message (another process sends it). If the other process
        is not ready to Send, then the process running on my_rank == 0 will simply
        remain idle. If the message is never sent, my_rank == 0 will wait a very long time!*/
        for(i = 1;i < nprocs; i++)
        {
            debug("%s %d%s ", "Receive from slave", i, " to master\n");
            /************************ MASTER: RECEIVE ***********************/
            debug("Rank %d approaching master MPI_Probe.\n", rank);
            // Probe for an incoming message from process zero
            MPI_Probe(rank, 0, MPI_COMM_WORLD, &status);
            debug("Rank %d going by MPI_Probe.\n", rank);

            // When probe returns, the status object has the size and other
            // attributes of the incoming message. Get the size of the message
            MPI_Get_count(&status, MPI_INT, &chunkSize);

            rc = MPI_Recv(&slaveArray, chunkSize, MPI_INT, i, 0, MPI_COMM_WORLD, &status);

            debug("Slave %d dynamically received %d numbers from 0.\n", rank, chunkSize);
            //Store subArray in 2D array
            debug("%s", "Storing subArray in 2DArray...\n");

            arrayOfArrays[i-1] = slaveArray;
        }
        //rebuild entire sorted array from sorted subarrays
        //starting with smallest value, validate that each element is <= next element

        //Finish timing the runtime of this application
        finish = MPI_Wtime();
        //Compute the runtime
        difference = finish-start;
        //Inform user
        debug("%s", "Exiting MASTER process\n");
        debug("%s %lg", "Time for completion:", difference);
    }
    /************************* End MASTER ***************************/

    /************************ SLAVE id > 1 **************************/
    else
    {
        debug("%s", "Entering SLAVE process\n");
        //by process id
        debug("%s %d%s", "Receive in slave", rank, " from master \n");

        debug("Rank %d approaching Slave MPI_Probe.\n", rank);
        // Probe for an incoming message from process zero

        MPI_Probe(MPI_ANY_SOURCE, 0, MPI_COMM_WORLD, &status);
        debug("Rank %d going by Slave MPI_Probe.\n", rank);
        // When probe returns, the status object has the size and other
        // attributes of the incoming message. Get the size of the message
        MPI_Get_count(&status, MPI_INT, &chunkSize);
        debug("Count %d and chunkSize %d after Slave MPI_Get_count.\n", rank, chunkSize);
        /******************** SLAVE: RECEIVE ***************************/
        rc = MPI_Recv(&subArray, chunkSize, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);
        debug("%d dynamically received %d numbers from 0.\n", rank, chunkSize);

        /*Store the received subArray in the slaveArray for processing and sending back
            to master*/
        slaveArray = subArray;

        //Take a look at incoming subArray: size = N/#processes)
        debug("%s ", "Show the slaveArray contents in slave receive\n");
        debug("Before bubblesort: start %d, finish: %d\n", (rank-1) * chunkSize, rank * chunkSize);

        //showArray(slaveArray, (rank-1) * chunkSize, rank * chunkSize);
        //Running the actual sorting algorithm on the current slaves subArray
        //bubble(slaveArray, ARRAYSIZE);
        //Return sorted subArray back to the master by process id

        debug("%s %d%s", "Send from slave", i, " to master \n");

        /************************ SLAVE: SEND ***************************/
        rc = MPI_Send(&slaveArray, chunkSize, MPI_INT, 0, 0, MPI_COMM_WORLD);
        debug("%s", "Exiting SLAVE process\n");
    }
    /************************* END SLAVE ****************************/
    //Clean up memory
    rc = MPI_Get_count(&status, MPI_INT, &chunkSize);
    debug("Process %d: received %d int(s) from process %d with tag %d \n", rank, chunkSize, status.MPI_SOURCE, status.MPI_TAG);
    /* EXIT MPI */
    debug("%s", "Exiting main()\n");
    return 0;
}

Upvotes: 0

Views: 843

Answers (2)


Reputation: 9

OK, maybe it would be easier to show specific parts of the code to help me figure this out. I tried to write a function that allocates an int * array passed in by reference and tests whether the array is NULL and whether it is the size I want it to be. Below that is its caller. One thing I noticed is that the sizeof(buffer) call doesn't return what I thought it would. So, how else can I make that check? Also, the caller, createRandomArray, is itself called with an int * passed into it. Can you pass by reference as deep as you want? Am I using the correct syntax to make sure that masterArray gets populated in the caller (main()) via call by reference?

void safeMalloc(int *buffer, int size, int line_num)
{
    buffer = (int *)malloc(sizeof(int) * size);
    //Test that malloc allocated at least some memory
    if(buffer == NULL)
    {
        debug("ERROR: cannot allocate any memory for line %d\n", line_num);
    }
    else
    {
        debug("Successfully created the array through malloc()\n");
    }
    //Test that malloc allocated the correct amount of memory
    if(sizeof(buffer) != size)
    {
        debug("ERROR: Created %d bytes array instead of %d bytes through malloc() on line %d.\n", sizeof(buffer), size, line_num);
    }
}

void createRandomArray(int *masterArray)
{
    int i;
    debug("Entering createRandomArray()\n");
    safeMalloc(masterArray, ARRAYSIZE, 21);
    for(i = BEGIN;i < ARRAYSIZE;i++)
    {
        masterArray[i] = (rand() % (ARRAYSIZE - BEGIN)) + BEGIN;
        debug("%d ", masterArray[i]);
    }
    debug("\n Exiting createRandomArray()\n");
}
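On the two questions above: C passes every argument by value, including pointers, so assigning to buffer inside safeMalloc only changes the function's local copy and the caller's pointer is left untouched. Likewise, sizeof(buffer) is the size of the pointer itself (commonly 4 or 8 bytes), not of the block malloc returned, and standard C gives no way to ask how big that block is, so the element count has to be carried alongside the pointer. A minimal sketch of one possible fix follows, assuming hypothetical helper names (safe_malloc_ints, create_random_array) that are not in the code above: either pass the address of the pointer (int **) or return the pointer.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical replacement for safeMalloc: takes the ADDRESS of the caller's
   pointer, so the assignment through *out is visible to the caller.
   Returns 0 on success, -1 if malloc returned NULL. */
int safe_malloc_ints(int **out, size_t count, int line_num)
{
    assert(out != NULL);
    *out = malloc(count * sizeof **out);
    if (*out == NULL) {
        fprintf(stderr, "ERROR: cannot allocate %zu ints (line %d)\n",
                count, line_num);
        return -1;
    }
    return 0;
}

/* Hypothetical replacement for createRandomArray: returns the new array
   (or NULL on failure) instead of trying to fill in a pointer argument.
   Values are drawn from [lo, hi). */
int *create_random_array(size_t count, int lo, int hi)
{
    int *a = NULL;
    assert(hi > lo);
    if (safe_malloc_ints(&a, count, __LINE__) != 0)
        return NULL;
    for (size_t i = 0; i < count; i++)
        a[i] = rand() % (hi - lo) + lo;
    return a;
}
```

With this shape, main() would write something like masterArray = create_random_array(ARRAYSIZE, BEGIN, ARRAYSIZE); and then check the result for NULL before using it.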

Upvotes: 0


Reputation: 141564

Check that chunkSize >= 0, nprocs >= 2, and that malloc does not return NULL. I mean, add code that does this every time, for every malloc, and exits if any of these conditions is not true; don't just put in temporary debugging.
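As a rough sketch of what such permanent checks could look like (the names params_ok and xmalloc are made up for illustration, and plain exit is used here; in an MPI program you would probably call MPI_Abort(MPI_COMM_WORLD, 1) instead so that every rank stops):

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Returns 1 when the run parameters are usable, 0 otherwise.
   nprocs >= 2 because one rank is the master and at least one must work;
   this also keeps ARRAYSIZE/(nprocs-1) from dividing by zero. */
int params_ok(int nprocs, int chunk_size)
{
    if (nprocs < 2) {
        fprintf(stderr, "need at least 2 processes, got %d\n", nprocs);
        return 0;
    }
    if (chunk_size < 0) {
        fprintf(stderr, "chunk size must be >= 0, got %d\n", chunk_size);
        return 0;
    }
    return 1;
}

/* malloc wrapper that never returns NULL: it terminates the program
   instead. A zero-byte request is rounded up to one byte so that a NULL
   result always means failure. */
void *xmalloc(size_t nbytes)
{
    void *p = malloc(nbytes ? nbytes : 1);
    if (p == NULL) {
        fprintf(stderr, "out of memory requesting %zu bytes\n", nbytes);
        exit(EXIT_FAILURE);
    }
    return p;
}
```

Calling params_ok(nprocs, chunkSize) right after computeChunkSize, and using xmalloc in place of every bare malloc, would cover all three conditions.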

This loop might overflow bounds:

for(j = range.begin; j < range.end; j++)
    subArray[k] = masterArray[j];

You didn't show the code where masterArray is allocated (and you didn't pass nprocs to that function either, so how can it match up with ARRAYSIZE/(nprocs-1)?).

Also, subArray has chunkSize+1 elements, but range.end is defined as ARRAYSIZE/(nprocs-1). Based on the code you've shown (which doesn't include ARRAYSIZE, nor how chunkSize and nprocs are actually calculated), there's no reason to believe that ARRAYSIZE/(nprocs-1) <= chunkSize+1 always holds, i.e. that the slice being copied fits in subArray.

To avoid random segfaults, you should always, always check that an array index is within the bounds of the array before using the [] operator.
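For the slice copy in particular, one way to do that is to put the bounds logic in a single helper. This is only a sketch under my own assumptions (copy_slice and its parameters are invented names, and I'm assuming you know the length of masterArray and the capacity of subArray at the call site):

```c
#include <assert.h>
#include <stddef.h>

/* Copies src[begin..end) into dst without ever reading past src_len
   elements of src or writing past dst_cap elements of dst.
   Returns the number of elements actually copied. */
size_t copy_slice(int *dst, size_t dst_cap,
                  const int *src, size_t src_len,
                  size_t begin, size_t end)
{
    assert(dst != NULL && src != NULL);
    if (end > src_len)
        end = src_len;            /* clamp to the end of the source   */
    if (begin >= end)
        return 0;                 /* empty or inverted range          */
    size_t n = end - begin;
    if (n > dst_cap)
        n = dst_cap;              /* clamp to the destination's room  */
    for (size_t k = 0; k < n; k++)
        dst[k] = src[begin + k];
    return n;
}
```

The return value is also the element count you would hand to MPI_Send, so the amount sent always matches what was really copied. If a short copy should count as an error rather than be silently truncated, compare the return value against end - begin and bail out.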

Upvotes: 1
