Reputation: 1
The situation is this: I have an array of dimensions 4x4, and what I have to do is partition this matrix into "blocks" (aka smaller matrices) and distribute them to the "slave processes". More specifically, suppose the total number of processes is 4 (1 master, 3 slaves, and all of them will take part in the computation), which means partitioning the 4x4 matrix into four 2x2 matrices. However, I would like to avoid manually packing each block into a separate 2x2 "buffer". The question is: is there any "clever", more "painless" way to manage this?
PS: I have to solve this problem http://www.cas.usf.edu/~cconnor/parallel/2dheat/2dheat.html, which means a Cartesian communicator will be created.
Upvotes: 0
Views: 681
Reputation: 22670
This is essentially how (plain) MPI works. The 2×2 matrices constitute a distributed data structure. Together, they comprise the actual 4×4 matrix. You could of course also use four 1×4 or 4×1 matrices; that has some advantages (easier programming) and disadvantages (more communication needed when scaling up).
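If the part you want to avoid is hand-copying each 2×2 block into a send buffer, MPI derived datatypes can do the packing for you. Below is a minimal sketch (not taken from the linked code) for the 4-process case, assuming a row-major 4×4 array of doubles on the root; the names `global`, `local` and `blocktype` are purely illustrative.

```c
#include <mpi.h>
#include <stdio.h>

#define N 4   /* global matrix dimension */
#define B 2   /* block dimension: N / 2 for a 2x2 process grid */

int main(int argc, char **argv)
{
    int rank, nprocs;
    double global[N][N];   /* only filled on rank 0 */
    double local[B][B];    /* each rank's 2x2 block */

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);   /* assumed to be 4 here */

    if (rank == 0)
        for (int i = 0; i < N; ++i)
            for (int j = 0; j < N; ++j)
                global[i][j] = i * N + j;

    /* Describe one BxB block inside the NxN matrix, then resize its
       extent so consecutive blocks are one block-column apart. */
    MPI_Datatype block, blocktype;
    int sizes[2]    = { N, N };
    int subsizes[2] = { B, B };
    int starts[2]   = { 0, 0 };
    MPI_Type_create_subarray(2, sizes, subsizes, starts,
                             MPI_ORDER_C, MPI_DOUBLE, &block);
    MPI_Type_create_resized(block, 0, B * sizeof(double), &blocktype);
    MPI_Type_commit(&blocktype);

    /* Displacements (in units of the resized type) of each rank's
       block in the 2x2 block grid: (row*N + col) for row, col in {0,1}. */
    int counts[4] = { 1, 1, 1, 1 };
    int displs[4] = { 0, 1, N, N + 1 };

    MPI_Scatterv(&global[0][0], counts, displs, blocktype,
                 &local[0][0], B * B, MPI_DOUBLE,
                 0, MPI_COMM_WORLD);

    printf("rank %d got block starting with %.0f\n", rank, local[0][0]);

    MPI_Type_free(&blocktype);
    MPI_Type_free(&block);
    MPI_Finalize();
    return 0;
}
```

The same datatype can be reused with `MPI_Gatherv` to collect the blocks back onto the root after the computation.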
In actual problems, such as a 2D heat equation, you often need to consider the halo around each local matrix. This halo is then exchanged during each simulation step. Note that the code you linked uses full-sized matrices on each worker rank. This is a simplification, but it wastes resources and is thus not scalable.
MPI gives you some help managing this distributed data, for instance via Cartesian communicators, or via one-sided communication for easier halo exchange, but essentially you have to manage the distributed data structure yourself.
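For the halo exchange itself, a Cartesian communicator gives you the neighbor ranks directly via `MPI_Cart_shift`, and `MPI_Sendrecv` handles the grid boundary for free because shifts off the edge return `MPI_PROC_NULL`. A rough sketch, assuming 4 ranks arranged as a 2×2 grid and a (B+2)×(B+2) local array with a one-cell halo; `exchange_halo` and `u` are hypothetical names and the actual heat update is omitted.

```c
#include <mpi.h>

#define B 2   /* interior (owned) block size per rank */

/* One halo exchange on a (B+2)x(B+2) local array u over a 2x2,
   non-periodic Cartesian grid. Row index increases "downwards". */
void exchange_halo(double u[B + 2][B + 2], MPI_Comm cart)
{
    int north, south, west, east;
    MPI_Cart_shift(cart, 0, 1, &north, &south);  /* dim 0: rows    */
    MPI_Cart_shift(cart, 1, 1, &west, &east);    /* dim 1: columns */

    /* A column is B doubles, one per row, strided by the row length. */
    MPI_Datatype column;
    MPI_Type_vector(B, 1, B + 2, MPI_DOUBLE, &column);
    MPI_Type_commit(&column);

    /* Rows are contiguous: send interior rows, receive into halo rows. */
    MPI_Sendrecv(&u[1][1],     B, MPI_DOUBLE, north, 0,
                 &u[B + 1][1], B, MPI_DOUBLE, south, 0,
                 cart, MPI_STATUS_IGNORE);
    MPI_Sendrecv(&u[B][1],     B, MPI_DOUBLE, south, 1,
                 &u[0][1],     B, MPI_DOUBLE, north, 1,
                 cart, MPI_STATUS_IGNORE);

    /* Columns use the strided datatype. */
    MPI_Sendrecv(&u[1][1],     1, column, west, 2,
                 &u[1][B + 1], 1, column, east, 2,
                 cart, MPI_STATUS_IGNORE);
    MPI_Sendrecv(&u[1][B],     1, column, east, 3,
                 &u[1][0],     1, column, west, 3,
                 cart, MPI_STATUS_IGNORE);

    MPI_Type_free(&column);
}

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    /* 2x2 process grid, no periodicity; let MPI reorder ranks. */
    int dims[2] = { 2, 2 }, periods[2] = { 0, 0 };
    MPI_Comm cart;
    MPI_Cart_create(MPI_COMM_WORLD, 2, dims, periods, 1, &cart);

    double u[B + 2][B + 2] = { { 0.0 } };  /* local block plus halo */
    exchange_halo(u, cart);                /* once per time step    */

    MPI_Comm_free(&cart);
    MPI_Finalize();
    return 0;
}
```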
There are parallel paradigms that provide higher-level abstractions of distributed data structures, but even an overview would IMHO be too broad for this format. Many of them are related to the Partitioned Global Address Space (PGAS) concept. Implementations range from new languages and language extensions (e.g. Coarray Fortran) to libraries and frameworks. Some use MPI internally.
Upvotes: 1