Mircea

Reputation: 1999

What is the easiest way to split an array?

I have a problem when I try to split an array into several subarrays.

To be more exact, I have an array, let's say int a[10] = {1, 3, 2, 7, 8, 12, 5, 7, 68, 10}, and I'm running my program on X processes (at the moment I'm using 8, but it could be more or fewer).

I want to send each process a part of this array; for example, with my array, process0 would receive something like {1, 3}, process1 would receive {2, 7}, and so on, until the last process receives {68, 10}.

After I've sent each subarray, I will do some operations on each one, and then I want to merge all the subarrays back into a single array.

I've searched on Google a lot and I saw some examples using MPI_Send and MPI_Recv or MPI_Scatter and MPI_Gather. I've tried some of these methods, but everything I tried failed: I either get errors or a null pointer...

My Code:

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

#define N 32 
int A[N]; 

int main(int argc, char *argv[]) {

    int size;
    int rank;
    const int ROOT = 0; 
    int tag = 1234;

    MPI_Init(&argc, &argv);

    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    int count = N / (size - 1);

    int *localArray = (int *) malloc(count * sizeof(int));

    if (rank == ROOT) {
        for (int i = 0; i < N; i++) {
            A[i] = rand() % 10;
        }

        for (int dest = 1; dest < size; ++dest) {
            MPI_Send(&A[(dest - 1) * count], count, MPI_INT, dest, tag, MPI_COMM_WORLD);
            printf("P0 sent a %d elements to P%d.\n", count, dest);
        }

        for (int source = 1; source < size; source++) {
            MPI_Recv(localArray, count, MPI_INT, source, 2, MPI_COMM_WORLD, MPI_STATUS_IGNORE);

            //--------------------------------MERGE ALL THE RESULTS INTO A SORTED ARRAY-------------------------------------
            printf("Received results from task %d\n", source);
        }
    }
    else {
        MPI_Recv(localArray, count, MPI_INT, ROOT, tag, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        //---------------SORT THE localArray-------------------
        MPI_Send(localArray, count, MPI_INT, ROOT, tag, MPI_COMM_WORLD);
    }
    MPI_Finalize();
    return 0;
}

Whatever I've tried, I can't get results where I've put the comment. What am I doing wrong?

Upvotes: 2

Views: 3777

Answers (2)

Zulan

Reputation: 22660

While dreamcrash already suggested you could clean up your code using scatter & gather, I would put more emphasis on this. Use the built-in collective operations wherever possible. Do not try to rebuild them on your own. Not only is the code cleaner and easier to understand, it will also be significantly faster and allow all sorts of optimizations by the MPI implementation. Your example (assuming N is divisible by size) becomes:

if (rank == ROOT) {
    for (int i = 0; i < N; i++) {
        A[i] = rand() % 10;
    }
}

MPI_Scatter(A, count, MPI_INT, localArray, count, MPI_INT, ROOT, MPI_COMM_WORLD);
//---------------SORT THE localArray-------------------
MPI_Gather(localArray, count, MPI_INT, A, count, MPI_INT, ROOT, MPI_COMM_WORLD);

MPI_Finalize();

Note that the ROOT rank correctly participates in the computation and does send data to itself using scatter / gather without any additional code path.

Now, since your example explicitly uses N=10, which is not divisible by size=8, here is a version that works correctly. The idea is to distribute the remainder of the integer division evenly across the first remainder ranks (each of them gets one additional element to work on). You have to do that regardless of whether you use send/recv or scatter/gather. With scatter/gather you use the MPI_Scatterv / MPI_Gatherv variants, which take an array of send counts (how many elements each rank gets) and an array of displacements (the offset of each local part within the global array):

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

#define N 32 
int A[N]; 

int main(int argc, char *argv[]) {
    int size;
    int rank;
    const int ROOT = 0; 

    MPI_Init(&argc, &argv);

    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    // compute the work distribution
    int remainder = N % size;
    int local_counts[size], offsets[size];
    int sum = 0;
    for (int i = 0; i < size; i++) {
        local_counts[i] = N / size;
        if (remainder > 0) {
            local_counts[i] += 1;
            remainder--;
        }
        offsets[i] = sum;
        sum += local_counts[i];
    }

    int localArray[local_counts[rank]];

    if (rank == ROOT) {
        for (int i = 0; i < N; i++) {
            A[i] = rand() % 10;
        }
    }

    MPI_Scatterv(A, local_counts, offsets, MPI_INT, localArray, local_counts[rank], MPI_INT, ROOT, MPI_COMM_WORLD);

    //---------------SORT THE localArray-------------------

    MPI_Gatherv(localArray, local_counts[rank], MPI_INT, A, local_counts, offsets, MPI_INT, ROOT, MPI_COMM_WORLD);

    MPI_Finalize();
    return 0;
}
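
The sort itself is left as a placeholder in the code above. As a minimal sketch, qsort from <stdlib.h> could fill it in; cmp_int is just an illustrative comparison helper that would go at file scope:

// ascending comparison for qsort
static int cmp_int(const void *a, const void *b) {
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);
}

// at the "SORT THE localArray" comment inside main():
qsort(localArray, local_counts[rank], sizeof(int), cmp_int);

Keep in mind that after MPI_Gatherv the root only holds locally sorted chunks, so producing one fully sorted array still requires a final merge step on the root.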

Upvotes: 3

dreamcrash

Reputation: 51393

Change your code to something like this:

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

#define N 32 
int A[N];  // this should be global

int main(int argc, char *argv[]) {

    int size;
    int rank;
    const int VERY_LARGE_INT = 999999;
    const int ROOT = 0; 
    int tag = 1234;

    MPI_Init(&argc, &argv);

    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    int count = N / size ;

    int *localArray = (int *) malloc(count * sizeof(int));
    int localMin;  // minimum computed on rank i
    int globalMin; // will only be valid on rank == ROOT

    if (rank == ROOT) {
        for (int i = 0; i < N; i++) {
            A[i] = rand() % 10;
        }
        // master local copy
        for (int i = 0; i < count; i++)
            localArray[i] = A[i];

        for (int dest = 1; dest < size; ++dest) {
            MPI_Send(&A[dest * count], count, MPI_INT, dest, tag, MPI_COMM_WORLD);
            printf("P0 sent %d elements to P%d.\n", count, dest);
        }
        localMin = VERY_LARGE_INT;

        for (int source = 1; source < size; source++)
        {
            MPI_Recv(localArray, count, MPI_INT, source, 2, MPI_COMM_WORLD,
                    MPI_STATUS_IGNORE);

            //--------------------------------I CANT GET RESULT HERE-------------------------------------
            printf("Received results from task %d\n", source);
        }

    }
    else 
    {
        MPI_Recv(localArray, count, MPI_INT, ROOT, tag, MPI_COMM_WORLD, MPI_STATUS_IGNORE);

       //.. do something
       MPI_Send(localArray, count, MPI_INT, ROOT, 2, MPI_COMM_WORLD);
    }
    MPI_Finalize();
    return 0;
} 

Some mistakes:

  • Array A is global, therefore all processes have their own copy of it; you most likely want to allocate it only on the master process;

  • I changed N / (size - 1) to N / size; however, be aware that this only works when N % size == 0, so you might want to handle the opposite scenario (see the sketch at the end of this answer).

  • Since the master will have a sub-copy of the global array, I perform this local copy from A to localArray before sending the data to the slaves:

    // master local copy
    for (int i = 0; i < count; i++)
        localArray[i] = A[i];
    
  • You have a small mistake in the merging part: the master and the slaves were using different tags, which caused a deadlock. That is why I also changed this:

    MPI_Send(localArray, count, MPI_INT, ROOT, tag, MPI_COMM_WORLD);

    to

    MPI_Send(localArray, count, MPI_INT, ROOT, 2, MPI_COMM_WORLD);

    Both now use the same tag (2);

  • You could implement this code with scatter and gather, and it would be a lot cleaner; see here for some examples.

Another minor issue: since you are using C, instead of int *localArray = (int *) malloc(count * sizeof(int)); you should write int *localArray = malloc(count * sizeof(int)); see here for why.
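
For the send/recv version, the sketch mentioned above for the case N % size != 0 could look like the following; local_counts and offsets are illustrative names, computed the same way as in Zulan's answer:

// distribute the remainder over the first (N % size) ranks
int base = N / size, remainder = N % size;
int local_counts[size], offsets[size];
for (int i = 0, sum = 0; i < size; i++) {
    local_counts[i] = base + (i < remainder ? 1 : 0);
    offsets[i] = sum;
    sum += local_counts[i];
}
// the master would then send local_counts[dest] elements starting at
// A[offsets[dest]] to each dest, and every rank would allocate
// local_counts[rank] ints for its localArray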

Upvotes: 0
