ziv

Reputation: 189

creating MPI Derived Datatypes for an array of data

On every processor I have a one-dimensional array, different parts of which will be sent to different processors.
Suppose it is declared like this:

double a[n];

If n = 1200, the size of the array is 1200*sizeof(double).

For example, suppose I am running on 4 processors. In the first step the data is distributed as follows:

first 50 stays on rank  
next 100 goes to mod(rank+1,nproc)  
next 70  goes to mod(rank+2,nproc)  
next 80  goes to mod(rank+3,nproc)  

next 50*2  stays on rank  
next 100*2 goes  to mod(rank+1,nproc)  
next 70*2  goes  to mod(rank+2,nproc)  
next 80*2  goes  to mod(rank+3,nproc)  

next 50 stays on rank  
next 100 goes to mod(rank+1,nproc)  
next 70  goes to mod(rank+2,nproc)   
next 80  goes to mod(rank+3,nproc)  

In the next step the total count could be different, say n = 2000:

first 150 stays on rank  
next 100 goes to mod(rank+1,nproc)  
next 100  goes to mod(rank+2,nproc)  
next 150  goes to mod(rank+3,nproc)  

next 150*2  stays on rank  
next 100*2 goes  to mod(rank+1,nproc)  
next 100*2  goes  to mod(rank+2,nproc)  
next 150*2  goes  to mod(rank+3,nproc)  

next 150 stays on rank  
next 100 goes to mod(rank+1,nproc)  
next 100  goes to mod(rank+2,nproc)  
next 150  goes to mod(rank+3,nproc)  

and so on,

What is the best way of sending this data? If I use blocking send/recv (which is what I currently do), I have to create a temporary array, copy into it all the data that needs to go to rank+1, and then send it.
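Roughly, my current approach looks like this (a simplified sketch; the offsets and counts are taken from the first-step example above, the tmp buffer is just illustrative, and rank and nproc are assumed to be set up elsewhere):

// Pack the pieces destined for mod(rank+1,nproc) into a temporary buffer,
// then do a single blocking send.
double a[1200];
double tmp[400];                      // 100 + 200 + 100 elements for rank+1
int dest = (rank + 1) % nproc;

memcpy(tmp,       &a[50],  100 * sizeof(double));
memcpy(tmp + 100, &a[200], 200 * sizeof(double));
memcpy(tmp + 300, &a[350], 100 * sizeof(double));

MPI_Send(tmp, 400, MPI_DOUBLE, dest, 0, MPI_COMM_WORLD);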

Is there a way to create a derived datatype so that I can just send newtype to each destination rank?

The data that goes to mod(rank+1,nproc) in the first step has this layout:

[50 offset][100 d][200 offset][200d][350 offset][100d]

next 100 goes to mod(rank+1,nproc)  
... 
next 100*2 goes  to mod(rank+1,nproc)  
...  
next 100 goes to mod(rank+1,nproc)   

and in the next step

[150 offset][100 d][450 offset][200d][650 offset][100d]

Is there any suggestion on how to create such a derived datatype?

I cannot create a different datatype for each rank (one for rank+1, another for rank+2, and so on): with 1024 processors or more I would have to create 1024 datatypes at each iteration and destroy them at the end. From my previous experience, creating many datatypes with Indexed or Struct and freeing them at each iteration is expensive. Does anybody know more about this?

Upvotes: 1

Views: 221

Answers (1)

suszterpatt

Reputation: 8273

You write:

If I use blocking send/recv (which is what I currently do), I have to create a temporary array, copy into it all the data that needs to go to rank+1, and then send it.

You wouldn't necessarily need to do that: you could simply send each block in a separate message. If performance is a concern, you could send them in a nonblocking way, and only block after the last one has been sent.
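A sketch of what that could look like (untested; the offsets and lengths are taken from your first-step example, and rank, nproc, and the tags are just illustrative):

// Send each block destined for mod(rank+1,nproc) as its own nonblocking
// message, then wait for all of them before reusing the buffer.
int dest = (rank + 1) % nproc;
MPI_Request reqs[3];

MPI_Isend(&a[50],  100, MPI_DOUBLE, dest, 0, MPI_COMM_WORLD, &reqs[0]);
MPI_Isend(&a[200], 200, MPI_DOUBLE, dest, 1, MPI_COMM_WORLD, &reqs[1]);
MPI_Isend(&a[350], 100, MPI_DOUBLE, dest, 2, MPI_COMM_WORLD, &reqs[2]);

MPI_Waitall(3, reqs, MPI_STATUSES_IGNORE);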

That said, if you want to send only one message per process, then the Indexed datatype is what you need. At the time of posting, this site has a good explanation, but here's the short version:

For the case in your example:

[50 offset][100 d][200 offset][200d][350 offset][100d]

You'd define a datatype as such (warning, untested code ahead):

double a[1200];          // n = 1200 in the question's example
MPI_Datatype newType;
int numBlocks = 3;       // 3 blocks of data
int displacements[3];    // at what position does each block start
int blockLengths[3];     // how many elements does each block contain
displacements[0] = 50;   // 50 offset
blockLengths[0] = 100;   // 100 d
displacements[1] = 200;  // 200 offset
blockLengths[1] = 200;   // 200 d
displacements[2] = 350;  // 350 offset
blockLengths[2] = 100;   // 100 d

// displacements are counted in multiples of the old type (MPI_DOUBLE here)
MPI_Type_indexed(numBlocks, blockLengths, displacements, MPI_DOUBLE, &newType);
MPI_Type_commit(&newType);
MPI_Send(a, 1, newType, ...);

Note that you do not need to define a similar datatype on the receiving end: the receiving process can simply receive this message as an array of MPI_DOUBLEs, provided its receive buffer is large enough.
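For example, the matching receive could be as simple as this (untested; 400 is just the total element count of the three blocks in your example):

// The sender's derived type describes a strided layout, but the message
// itself is just 400 doubles, so a contiguous receive buffer is enough.
double recvBuf[400];
MPI_Status status;
MPI_Recv(recvBuf, 400, MPI_DOUBLE, MPI_ANY_SOURCE, MPI_ANY_TAG,
         MPI_COMM_WORLD, &status);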

Also note that since the layout of your datatype is different for each rank and each iteration step, you must define and commit a similar datatype each time. You should also free the old types with MPI_Type_free().
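The per-iteration pattern would look roughly like this (untested; buildLayout() is a hypothetical helper standing in for whatever code computes the block lengths and displacements for a given destination in the current step):

// Build the layout for this destination and step, create and commit the
// type, send, then free the type so handles don't accumulate.
int blockLengths[3], displacements[3];
buildLayout(dest, step, blockLengths, displacements);   // hypothetical helper

MPI_Datatype stepType;
MPI_Type_indexed(3, blockLengths, displacements, MPI_DOUBLE, &stepType);
MPI_Type_commit(&stepType);
MPI_Send(a, 1, stepType, dest, 0, MPI_COMM_WORLD);
MPI_Type_free(&stepType);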

Upvotes: 1
