Reputation: 21
I've divided a matrix into blocks and multiplied it using Fox's algorithm.
How can I print the result matrix to the screen when it is stored by blocks in different processes, without sending all of these blocks back to the process with rank 0?
For example, after multiplication I've got:
Block A:
83 64
112 76
Block B:
118 44
152 34
Block C:
54 68
67 56
Block D:
89 85
114 68
Entire matrix should look like:
83 64 118 44
112 76 152 34
54 68 89 85
67 56 114 68
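To make the layout explicit, here is a minimal serial sketch of how a global element maps to a block and a local offset. It assumes blocks A to D sit on ranks 0 to 3 of a row-major process grid and that each block is stored row-major; `AssembleFromBlocks` and its parameters are hypothetical names used only for illustration, and it has all blocks in one place purely to show the index arithmetic.

```cpp
#include <vector>

// Hypothetical helper: builds the full matrix from per-rank blocks.
// Assumes ranks are laid out row-major on a gridSize x gridSize grid
// and each block is a row-major blockSize x blockSize array.
std::vector<double> AssembleFromBlocks(
        const std::vector<std::vector<double>>& blocks,
        int gridSize, int blockSize) {
    const int size = gridSize * blockSize;
    std::vector<double> full(size * size);
    for (int r = 0; r < size; ++r) {
        for (int c = 0; c < size; ++c) {
            // Which block owns element (r, c), and where inside that block.
            int rank  = (r / blockSize) * gridSize + (c / blockSize);
            int local = (r % blockSize) * blockSize + (c % blockSize);
            full[r * size + c] = blocks[rank][local];
        }
    }
    return full;
}
```

Called with the four blocks above, `gridSize = 2` and `blockSize = 2`, this should reproduce the 4x4 matrix shown.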
What I've done so far:
I gather the two blocks that make up one block-row on process 0 and print them to the screen. But is it possible to print the entire result matrix without sending more than one block to process 0?
// Function for gathering the result matrix, one block-row at a time
// pCblock   - this process's block of the result matrix
// Size      - matrix dimension
// BlockSize - block dimension
// RowComm   - row communicator of the process grid (declared outside this function)
void ResultCollection(double* pCblock, int Size, int BlockSize) {
    double* pResultRow = new double[Size * BlockSize];
    // Gather row i of every block in this grid row into row i of pResultRow
    for (int i = 0; i < BlockSize; i++) {
        MPI_Gather(&pCblock[i * BlockSize], BlockSize, MPI_DOUBLE,
                   &pResultRow[i * Size], BlockSize, MPI_DOUBLE, 0, RowComm);
    }
    // print two matrix rows from two blocks
    delete[] pResultRow;
}
This doesn't help (Ordering Output in MPI), because for the matrix output I need to print not the entire block A, then B, then C, then D, but rather one line from A (in process 0), one line from B (from process 1), one line from A (in process 0), one line from B (from process 1), one line from C (from process 2), one line from D (from process 3), and so on.
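The required interleaving can be written down as a schedule of (rank, local row) pairs. This is only a sketch of the ordering, not of the MPI communication itself; it assumes ranks are numbered row-major on a square process grid, and `PrintSchedule` is a hypothetical name.

```cpp
#include <utility>
#include <vector>

// Hypothetical helper: returns the order in which (rank, localRow) pieces
// have to appear on screen so that full matrix rows come out in sequence.
// Assumes ranks are laid out row-major on a gridSize x gridSize grid.
std::vector<std::pair<int, int>> PrintSchedule(int gridSize, int blockSize) {
    std::vector<std::pair<int, int>> order;
    for (int gridRow = 0; gridRow < gridSize; ++gridRow) {
        for (int localRow = 0; localRow < blockSize; ++localRow) {
            // One full matrix row: the same local row from every block
            // in this grid row, left to right.
            for (int gridCol = 0; gridCol < gridSize; ++gridCol) {
                order.emplace_back(gridRow * gridSize + gridCol, localRow);
            }
        }
    }
    return order;
}
```

For a 2x2 grid with 2x2 blocks this gives ranks 0, 1, 0, 1, 2, 3, 2, 3, so each process would need to take its turn once per local row rather than once per block.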
Upvotes: 1
Views: 378
Reputation: 1
Well, it is time to realise that unless the process with rank 0 is equipped with some sort of clairvoyance, it will never be able to pretty-print results that were computed remotely in a herd of decentralised, distributed processes.
Similarly, it is easy to test, if you still do not believe what has been published on this, that MPI-distributed code was never promised any guarantee, weak or strong, about how the uncoordinated, asynchronously printed character streams from remote processes get ordered into one common serial output, the system stdout, and finally put onto the screen.
Even if you played a lot with an addressable, ANSI-coded screen, such design efforts would not yield universally working code, and the tricks needed to inject "absolute" addressing into the ANSI-coded output would be awful both to implement and to operate just to paint a result on screen correctly.
Your MPI infrastructure advisors or admins will surely help you and show you appropriate tools for collecting the results and post-processing them accordingly.
Upvotes: 0