Silouane Gerin

Reputation: 1251

Fox Algorithm with MPI

I am writing an implementation of the Fox algorithm with MPI in C. I have already subdivided my global_matrix into smaller blocks, so each process holds a small block of matrix A and of matrix B. However, I have trouble understanding how to implement the Fox algorithm: a lot of the code found on the internet does the following.

Implementation for Fox Algorithm

What I don't understand: in the last slides, there is C code that is supposed to implement the algorithm. But it seems that the temp array is never properly initialized, which should result in undefined behaviour when it is used in MPI_Bcast() and in the matrix multiplication.

I think I have the algorithm almost working, but my result values are definitely wrong.

(I can provide code if you need it.)

Thanks for your answers!

Upvotes: 1

Views: 3036

Answers (2)

Silouane Gerin

Reputation: 1251

So I found a solution to my problem almost right after asking the question. For completeness, I've pushed the code to GitHub. Keep in mind that it's a school project and that it's not completely finished or correct. The comments may also be a bit odd: I'm not a native English speaker. My code on github

Nobilis had the answer: MPI_Bcast is not just a function that sends data, it also receives it. MPI_Bcast must be called by every process that should receive the data as well as by the sender. For example, if I write:

int* int_array = malloc(10*sizeof(int));
int root = 0;

if(my_rank == 0)
{
    for(int i=0; i<10; ++i)
        int_array[i] = i;
}

MPI_Bcast(int_array, 10, MPI_INT, root, MPI_COMM_WORLD);

This code means: every process started by MPI allocates 10 ints. Then only the process of rank 0 puts valid data in the allocated array. Then every process calls MPI_Bcast with the same arguments: the buffer where the data should be written (or from which it is sent, in the case of my_rank == 0), the count and type of the data (is it an array or just one int?), and the root that sends the data to every process in MPI_COMM_WORLD.

That's why it doesn't matter that int_array is uninitialized in most processes (all except the one where my_rank == root): the broadcast overwrites it.
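To relate this to the Fox algorithm, here is a minimal serial sketch of the stages, without any MPI calls, so it only illustrates the data flow. Everything in it is my own assumption and not the code from the slides: the names fox_multiply_sim, naive_multiply, get_block and block_multiply_add are hypothetical, the q x q process grid is simulated with arrays of blocks, a memcpy into tmp stands in for the MPI_Bcast along a grid row, and another memcpy stands in for the upward shift of the B blocks. Note that tmp is never initialized by hand: it is always filled by the "broadcast" before it is read.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Copy block (bi, bj) of the n x n row-major matrix M into the nb x nb buffer blk. */
static void get_block(const double *M, int n, int nb, int bi, int bj, double *blk)
{
    for (int r = 0; r < nb; ++r)
        memcpy(blk + r * nb, M + (bi * nb + r) * n + bj * nb, nb * sizeof(double));
}

/* c += a * b for nb x nb row-major blocks. */
static void block_multiply_add(const double *a, const double *b, double *c, int nb)
{
    for (int r = 0; r < nb; ++r)
        for (int k = 0; k < nb; ++k)
            for (int col = 0; col < nb; ++col)
                c[r * nb + col] += a[r * nb + k] * b[k * nb + col];
}

/* Plain triple loop, used only as a reference result. */
void naive_multiply(const double *A, const double *B, double *C, int n)
{
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j) {
            double sum = 0.0;
            for (int k = 0; k < n; ++k)
                sum += A[i * n + k] * B[k * n + j];
            C[i * n + j] = sum;
        }
}

/* Serial simulation of the Fox stages on a q x q grid (hypothetical helper). */
void fox_multiply_sim(const double *A, const double *B, double *C, int n, int q)
{
    assert(q > 0 && n % q == 0);
    int nb = n / q;                      /* block edge length */
    size_t bsz = (size_t)nb * nb;        /* elements per block */
    size_t p = (size_t)q * q;            /* number of simulated processes */

    double *a = malloc(p * bsz * sizeof *a);    /* local A block of each process */
    double *b = malloc(p * bsz * sizeof *b);    /* local B block of each process */
    double *c = calloc(p * bsz, sizeof *c);     /* local C blocks, start at zero */
    double *tmp = malloc(bsz * sizeof *tmp);    /* receive buffer, left uninitialized */
    double *spare = malloc(bsz * sizeof *spare);
    assert(a && b && c && tmp && spare);

    /* Initial distribution: process (i, j) owns A_ij and B_ij. */
    for (int i = 0; i < q; ++i)
        for (int j = 0; j < q; ++j) {
            get_block(A, n, nb, i, j, a + (i * q + j) * bsz);
            get_block(B, n, nb, i, j, b + (i * q + j) * bsz);
        }

    for (int stage = 0; stage < q; ++stage) {
        for (int i = 0; i < q; ++i) {
            int k = (i + stage) % q;     /* column of the root in grid row i */
            for (int j = 0; j < q; ++j) {
                /* Stands in for MPI_Bcast on the row: the root's A block
                   overwrites tmp, whatever tmp contained before. */
                memcpy(tmp, a + (i * q + k) * bsz, bsz * sizeof(double));
                block_multiply_add(tmp, b + (i * q + j) * bsz,
                                   c + (i * q + j) * bsz, nb);
            }
        }
        /* Stands in for the shift: each B block moves up one grid row (cyclic). */
        for (int j = 0; j < q; ++j) {
            memcpy(spare, b + j * bsz, bsz * sizeof(double));
            for (int i = 0; i + 1 < q; ++i)
                memcpy(b + (i * q + j) * bsz, b + ((i + 1) * q + j) * bsz,
                       bsz * sizeof(double));
            memcpy(b + ((q - 1) * q + j) * bsz, spare, bsz * sizeof(double));
        }
    }

    /* Gather the local C blocks back into the global matrix. */
    for (int i = 0; i < q; ++i)
        for (int j = 0; j < q; ++j)
            for (int r = 0; r < nb; ++r)
                memcpy(C + (i * nb + r) * n + j * nb,
                       c + (i * q + j) * bsz + r * nb, nb * sizeof(double));

    free(a); free(b); free(c); free(tmp); free(spare);
}
```

Calling fox_multiply_sim(A, B, C, 4, 2) on 4 x 4 row-major arrays and comparing against naive_multiply(A, B, R, 4) is a quick way to check the stage logic before involving MPI. In a real MPI version, each simulated process becomes one rank and the two memcpy steps become the row broadcast and the column shift.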

Note that you can use an MPI_Datatype to send data in a specific layout. If you do, you should read about:

  • MPI_Type_create_subarray
  • MPI_Type_create_resized
  • MPI_Type_commit

Hope this helps someone.

Upvotes: 1

Nobilis

Reputation: 7448

While this doesn't answer your original question, can I just remark that MPI_Bcast and matrixmult both take tmp as their first argument in the else block, perhaps using it as a destination in which to store values.

Without seeing how those two functions are implemented, you can't know for sure whether tmp is used uninitialised.

Also, malloc-allocated memory can sometimes happen to be zero-initialised, though it's not behaviour I would rely on.

And finally, if you're going to use the code in the slides, don't cast the result of malloc.

Upvotes: 1
