Ali.Nemat

Reputation: 49

MPI collective communication

I am trying to code quicksort in MPI. The parallelization scheme is simple: the root scatters the list across MPI_COMM_WORLD, each node runs qsort() on its subarray, and MPI_Gather() returns all the subarrays to the root, which runs qsort() on the whole array once more. Simple enough, yet I get an error. My first guess was that the subarray sizes might not be exact, since the list size is simply divided by comm_size, which could lead to a segmentation fault. But I use a list of size 1000 and 4 processors, so the division gives exactly 250 and there should be no segmentation fault. Yet there is one. Could you tell me where I am wrong?

#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <mpi.h>

/* comparetor() and printArray() are defined elsewhere in my file */

int main()
{
    int array [1000];
    int arrsize;
    int chunk;
    int* subarray;
    int rank ;
    int comm_size;
    MPI_Init(NULL,NULL);
    MPI_Comm_size(MPI_COMM_WORLD,&comm_size);
    MPI_Comm_rank(MPI_COMM_WORLD,&rank);
    if(rank==0)
    {
        time_t t;
        srand((unsigned)time(&t));
        int arrsize = sizeof(array) / sizeof(int);
        for (int i = 0; i < arrsize; i++)
            array[i] = rand() % 1000;
        printf("\n this is processor %d and the unsorted array is:",rank);
        printArray(array,arrsize);          
    }

    MPI_Scatter( array,arrsize,MPI_INT, subarray,chunk,MPI_INT,0,MPI_COMM_WORLD);
    chunk = (int)(arrsize/comm_size);
    subarray = (int*)calloc(arrsize,sizeof(int));

    if(rank != 0)
    {
        qsort(subarray,chunk,sizeof(int),comparetor);
    }

    MPI_Gather( subarray,chunk, MPI_INT,array, arrsize, MPI_INT,0, MPI_COMM_WORLD);
    if(rank==0)
    {
        qsort(array,arrsize,sizeof(int),comparetor);
        printf("\n this is processor %d and this is sorted array: ",rank);
        printArray(array,arrsize);
    }
    free(subarray);
    MPI_Finalize();
    return 0;
}

and the error says:

Invalid MIT-MAGIC-COOKIE-1 key[h:04865] *** Process received signal ***
[h:04865] Signal: Segmentation fault (11)
[h:04865] Signal code: Address not mapped (1)
[h:04865] Failing at address: 0x421e45
[h:04865] [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x46210)[0x7f1906b29210]
[h:04865] [ 1] /lib/x86_64-linux-gnu/libc.so.6(+0x18e533)[0x7f1906c71533]
[h:04865] [ 2] /lib/x86_64-linux-gnu/libopen-pal.so.40(+0x4054f)[0x7f190699654f]
[h:04865] [ 3] /lib/x86_64-linux-gnu/libmpi.so.40(ompi_datatype_sndrcv+0x51a)[0x7f1906f3288a]
[h:04865] [ 4] /lib/x86_64-linux-gnu/libmpi.so.40(ompi_coll_base_scatter_intra_basic_linear+0x12c)[0x7f1906f75dec]
[h:04865] [ 5] /lib/x86_64-linux-gnu/libmpi.so.40(PMPI_Scatter+0x10d)[0x7f1906f5952d]
[h:04865] [ 6] ./parallelQuickSortMPI(+0xc8a5)[0x5640c424b8a5]
[h:04865] [ 7] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf3)[0x7f1906b0a0b3]
[h:04865] [ 8] ./parallelQuickSortMPI(+0xc64e)[0x5640c424b64e]
[h:04865] *** End of error message ***
--------------------------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 0 on node h exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------

Upvotes: 0

Views: 149

Answers (1)

j23

Reputation: 3530

The segmentation fault is caused by the lines below.

MPI_Scatter( array,arrsize,MPI_INT, subarray,chunk,MPI_INT,0,MPI_COMM_WORLD);
chunk = (int)(arrsize/comm_size);
subarray = (int*)calloc(arrsize,sizeof(int));

You allocate subarray and compute chunk only after the MPI_Scatter operation. MPI_Scatter is a collective operation, and the necessary memory (e.g. the receive buffer) as well as the receive count must be allocated and defined before the call.

chunk = (int)(arrsize/comm_size);
subarray = (int*)calloc(arrsize,sizeof(int));
MPI_Scatter( array,arrsize,MPI_INT, subarray,chunk,MPI_INT,0,MPI_COMM_WORLD);

The above is the right order; with this change you will get past the segmentation fault.
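Even after reordering, two related bugs in the original code remain. First, arrsize is only assigned inside the rank-0 branch, and even there a shadowing local int arrsize hides the outer variable, so chunk is computed from an uninitialized value on every rank. Second, the send count of MPI_Scatter and the receive count of MPI_Gather are per-process counts, so both should be chunk, not arrsize. Below is a minimal corrected sketch along those lines; since comparetor and printArray are not shown in the question, the versions here are plausible stand-ins, not the asker's originals.

#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <mpi.h>

/* Stand-in comparator: (x > y) - (x < y) avoids the overflow that x - y can cause. */
static int comparetor(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Stand-in printer for the asker's missing helper. */
static void printArray(const int *a, int n)
{
    for (int i = 0; i < n; i++)
        printf("%d ", a[i]);
    printf("\n");
}

int main(void)
{
    int array[1000];
    int arrsize = sizeof(array) / sizeof(int);   /* known on every rank, not just root */
    int rank, comm_size;

    MPI_Init(NULL, NULL);
    MPI_Comm_size(MPI_COMM_WORLD, &comm_size);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    int chunk = arrsize / comm_size;             /* assumes comm_size divides arrsize evenly */
    int *subarray = calloc(chunk, sizeof(int));  /* receive buffer, allocated BEFORE the scatter */

    if (rank == 0) {
        srand((unsigned)time(NULL));
        for (int i = 0; i < arrsize; i++)
            array[i] = rand() % 1000;
        printf("this is processor %d and the unsorted array is: ", rank);
        printArray(array, arrsize);
    }

    /* Both counts are per-process element counts, so both are chunk. */
    MPI_Scatter(array, chunk, MPI_INT, subarray, chunk, MPI_INT, 0, MPI_COMM_WORLD);

    qsort(subarray, chunk, sizeof(int), comparetor);  /* every rank sorts its own piece */

    MPI_Gather(subarray, chunk, MPI_INT, array, chunk, MPI_INT, 0, MPI_COMM_WORLD);

    if (rank == 0) {
        qsort(array, arrsize, sizeof(int), comparetor);  /* final pass over the gathered pieces */
        printf("this is processor %d and this is the sorted array: ", rank);
        printArray(array, arrsize);
    }

    free(subarray);
    MPI_Finalize();
    return 0;
}

Compile with mpicc and run with, e.g., mpirun -np 4 ./parallelQuickSortMPI. The final qsort on the root is what turns the gathered, piecewise-sorted chunks into a fully sorted array; a k-way merge of the sorted chunks would be cheaper, but re-sorting keeps the example simple.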

Upvotes: 0
