Reputation: 1251
I am coding an implementation of Fox's algorithm with MPI in C. I have already subdivided my global_matrix into smaller blocks, so each process has a small block of matrix A and of matrix B. However, I have trouble understanding how to implement Fox's algorithm: a lot of the code found on the internet does the following.
Implementation for Fox Algorithm
What I don't understand: in the last slides, there is C code that should implement the algorithm. But it seems that the temp array is never properly initialized, which should result in weird behaviour when it is used in MPI_Bcast() and in the matrix multiplication.
I think I have the algo almost working but my result values are definitely wrong.
(I can provide code if you need)
Thanks for your answers !
Upvotes: 1
Views: 3036
Reputation: 1251
So I found a solution to my problem almost right after I asked the question. For completeness, I've pushed the code to GitHub. Remember that it's a school project and that it's not completely finished or correct. Also, the comments may be a bit weird: I'm not a native English speaker. My code on github
Nobilis had the answer: MPI_Bcast isn't just a function to send data but also to receive it. MPI_Bcast should be called by every process that should receive the data and by the sender. That is, if I write:
int* int_array = malloc(10*sizeof(int));
int root = 0;
if(my_rank == 0)
{
    for(int i=0; i<10; ++i)
        int_array[i] = i;
}
MPI_Bcast(int_array, 10, MPI_INT, root, MPI_COMM_WORLD);
This code means: every process started by MPI allocates 10 ints. Then only the process of rank 0 puts valid data in the previously allocated array. Then every process calls MPI_Bcast with the same arguments: the buffer where the data should be written (or from which it should be sent, in the case of my_rank == 0), the count and type of the data (is it an array or just one int?), the root that sends the data, and the communicator (here MPI_COMM_WORLD) whose processes all receive it.
That's why it doesn't matter that int_array is uninitialized in most processes (all of them except the one where my_rank == root).
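To tie this back to the temp array from the question, here is a minimal serial sketch of Fox's algorithm, written without MPI so it compiles anywhere. The "broadcast" inside each grid row is simulated with memcpy. The names fox_simulate and block_multiply_add, and the block layout (block (i, j) stored at offset (i*q + j)*nb*nb), are my own assumptions for illustration; they are not taken from the slides or from my GitHub code. Note that temp is allocated with malloc and never initialized by hand: every slot is overwritten by the simulated broadcast before it is read.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* c += a * b for nb x nb row-major blocks. */
static void block_multiply_add(const double *a, const double *b,
                               double *c, int nb)
{
    for (int i = 0; i < nb; ++i)
        for (int k = 0; k < nb; ++k)
            for (int j = 0; j < nb; ++j)
                c[i * nb + j] += a[i * nb + k] * b[k * nb + j];
}

/* Serial simulation of Fox's algorithm on a q x q grid of "processes".
 * a, b and c each hold q*q blocks of nb*nb doubles (assumed layout:
 * block (i, j) starts at offset (i*q + j) * nb*nb).
 * c must be zeroed by the caller. b is shifted in place during the run
 * and is back in its original order after the q stages.
 * Returns 0 on success, -1 if an allocation fails. */
int fox_simulate(const double *a, double *b, double *c, int q, int nb)
{
    assert(q > 0 && nb > 0);
    size_t bs = (size_t)nb * nb;          /* doubles per block */
    size_t total = (size_t)q * q * bs;    /* doubles per matrix */

    /* One temp block per simulated process, left uninitialized on purpose. */
    double *temp = malloc(total * sizeof *temp);
    double *shifted = malloc(total * sizeof *shifted);
    if (!temp || !shifted) {
        free(temp);
        free(shifted);
        return -1;
    }

    for (int stage = 0; stage < q; ++stage) {
        for (int i = 0; i < q; ++i) {
            int root = (i + stage) % q;   /* column that owns the A block */
            for (int j = 0; j < q; ++j) {
                size_t me = ((size_t)i * q + j) * bs;
                /* Stand-in for MPI_Bcast in row i: temp receives root's A. */
                memcpy(temp + me, a + ((size_t)i * q + root) * bs,
                       bs * sizeof *temp);
                block_multiply_add(temp + me, b + me, c + me, nb);
            }
        }
        /* Shift the B blocks up by one row, with wrap-around. */
        for (int i = 0; i < q; ++i)
            for (int j = 0; j < q; ++j)
                memcpy(shifted + ((size_t)i * q + j) * bs,
                       b + ((size_t)((i + 1) % q) * q + j) * bs,
                       bs * sizeof *b);
        memcpy(b, shifted, total * sizeof *b);
    }

    free(temp);
    free(shifted);
    return 0;
}
```

In a real MPI version the inner memcpy would presumably be one MPI_Bcast per stage on a row communicator, and the shift would be a send/receive pair along the column. Only the root of each row needs valid data in the broadcast buffer beforehand.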
Note that you can use MPI_Datatype to send data in a specific layout. If so, you should read about:
Hope that can help someone.
Upvotes: 1
Reputation: 7448
While not answering your original question, can I just remark that MPI_Bcast and matrixmult both take tmp as a first argument in the else block, perhaps using it as a destination to store variables.
Without seeing how those two functions are implemented you can't know for sure whether tmp is used uninitialised.
Also, malloc-allocated memory can sometimes be zero-initialised, though it's not behaviour I would rely on.
And finally, if you're going to use the code in the slides, don't cast the result of malloc.
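To illustrate the last two points, here is a small sketch. The helper names alloc_zeroed_block and alloc_scratch_block are made up for this example and do not come from the slides. The first uses calloc when zeroed memory is actually required. The second uses malloc without a cast for a buffer that will be overwritten anyway, such as a broadcast destination.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: nb x nb block of doubles that must start at zero
 * (e.g. an accumulator). calloc returns all-bits-zero memory, which reads
 * as 0.0 on the usual IEEE 754 platforms. */
double *alloc_zeroed_block(int nb)
{
    assert(nb > 0);
    return calloc((size_t)nb * nb, sizeof(double));
}

/* Hypothetical helper: scratch block whose contents are indeterminate
 * until something writes to it. No cast on malloc's result is needed in C,
 * since void * converts implicitly to double *. */
double *alloc_scratch_block(int nb)
{
    assert(nb > 0);
    double *block = malloc((size_t)nb * nb * sizeof *block);
    return block;
}
```

Both helpers return NULL if the allocation fails, so the caller should check the result before using it and free it afterwards.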
Upvotes: 1