garbage_collector

Reputation: 103

MPI4PY: Scatter a matrix

I am using mpi4py to scatter n/p columns of a matrix to each of p processes. However, I am unable to send the columns the way I would like. What changes do I have to make to the code to get the result shown at the end?

The matrix is:

[1, 2, 3, 4]
[5, 6, 7, 8]
[9, 10, 11, 12]
[13, 14, 15, 16]

Then n=4 and p=2, so each process should receive 2 columns.

This is my code:

# Imports
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
size = comm.Get_size() 
rank = comm.Get_rank()

rows = 4
num_columns = rows/size

data=None

if rank == 0:
  data = np.matrix([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]])

recvbuf = np.empty((rows, int(num_columns)), dtype='int')
comm.Scatterv(data, recvbuf, root=0)
print('Rank: ',rank, ', recvbuf received:\n ',recvbuf)

I get the following output:

Rank:  0 , recvbuf received:
[[1 2]
[3 4]
[5 6]
[7 8]]
Rank:  1 , recvbuf received:
[[ 9 10]
[11 12]
[13 14]
[15 16]]

I want to get the following output instead:

Rank:  0 , recvbuf received:
[[1 2]
[5 6]
[9 10]
[13 14]]
Rank:  1 , recvbuf received:
[[ 3 4]
[7 8]
[11 12]
[15 16]]
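
A plain-NumPy sketch (no MPI involved) reproduces the output I am currently getting: Scatterv ignores the 2-D shape and simply hands each rank a contiguous half of the flat row-major buffer:

```python
import numpy as np

# The matrix from the question
data = np.array([[1, 2, 3, 4],
                 [5, 6, 7, 8],
                 [9, 10, 11, 12],
                 [13, 14, 15, 16]])

# Scatterv sees only this flat row-major buffer...
flat = data.ravel()

# ...and hands each rank a contiguous half of it
rank0 = flat[:8].reshape(4, 2)
rank1 = flat[8:].reshape(4, 2)
print(rank0)  # [[1 2] [3 4] [5 6] [7 8]] -- the unwanted layout
print(rank1)  # [[9 10] [11 12] [13 14] [15 16]]
```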

Upvotes: 1

Views: 2133

Answers (3)

Alan Pearl

Reputation: 11

The previous responses are good starting points, but the scatter_nd function below is a more general solution:

  • It can split on any specified axis (in this example I split on axis=1, as desired in the original post)
  • It allows uneven division (in this example I split a length-5 axis into two parts)
  • It lets you choose any MPI communicator and the root rank (where the data is located)

test_scatter_nd.py

import numpy as np
from mpi4py import MPI


def scatter_nd(array, axis=0, comm=MPI.COMM_WORLD, root=0):
    """Scatter n-dimensional array from root to all ranks"""
    ans = None
    if comm.rank == root:
        splits = np.array_split(array, comm.size, axis=axis)
        for i in range(comm.size):
            if i == root:
                ans = splits[i]
            else:
                comm.send(splits[i], dest=i)
    else:
        ans = comm.recv(source=root)
    return ans


arr = None
if MPI.COMM_WORLD.rank == 0:
    arr = np.array([[1,  2,  3,  4,  5],
                    [6,  7,  8,  9,  10],
                    [11, 12, 13, 14, 15],
                    [16, 17, 18, 19, 20],
                    [21, 22, 23, 24, 25]])

arr = scatter_nd(arr, axis=1)
print(arr, "shape:", arr.shape, "rank:", MPI.COMM_WORLD.rank)

Executing mpiexec -n 2 python test_scatter_nd.py returns:

[[ 1  2  3]
 [ 6  7  8]
 [11 12 13]
 [16 17 18]
 [21 22 23]] shape: (5, 3) rank: 0
[[ 4  5]
 [ 9 10]
 [14 15]
 [19 20]
 [24 25]] shape: (5, 2) rank: 1
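
The uneven split above (3 columns on rank 0, 2 on rank 1) comes straight from np.array_split, which can be checked without any MPI:

```python
import numpy as np

# A length-5 axis split two ways gives chunks of length 3 and 2
chunks = np.array_split(np.arange(1, 6), 2)
print([c.tolist() for c in chunks])  # [[1, 2, 3], [4, 5]]
```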

Upvotes: 1

TheIdealis

Reputation: 707

I have a slightly different answer using send and recv. Since I don't always end up with a matrix that can be distributed evenly, I allow a different slice size in every process. This might also be possible with Scatterv, but I find send and recv easier to handle:

import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
Np = comm.Get_size()
p = comm.Get_rank()


borders = np.array([0, 1, 2, 3, 5])

if p==0:
    arr = np.array([[1, 2, 3, 4, 5],
                    [6, 7, 8, 9, 0],
                    [4, 2, 3, 4, 5],
                    [6, 7, 8, 9, 1],
                    [2, 2, 3, 4, 5]])

    # Send each process its block of columns
    for i in range(1, Np):
        ps, pe = borders[i], borders[i+1]
        comm.send(arr[:, ps:pe], dest=i, tag=1)
    # Rank 0 keeps its own block
    ps, pe = borders[0], borders[1]
    arr = arr[:, ps:pe]

else:
    arr = comm.recv(source=0, tag=1)

print(p, arr)
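
If you would rather not hard-code borders, one way to derive them (my own sketch, not part of the original answer) is from the chunk sizes that np.array_split would produce:

```python
import numpy as np

n_items, n_procs = 5, 4
# Chunk sizes as np.array_split makes them: as even as possible
sizes = [len(s) for s in np.array_split(np.arange(n_items), n_procs)]
# Turn the sizes into cumulative slice boundaries
borders = np.concatenate([[0], np.cumsum(sizes)])
print(borders.tolist())  # [0, 2, 3, 4, 5]
```

Note that this puts the larger chunk first, whereas the hand-written borders above put it last; either works as long as the slices cover the whole axis.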

Upvotes: 0

Joe Todd

Reputation: 897

I think this code does what you are looking for. The issue is that Scatterv doesn't care about the numpy array's shape at all; it just sees a linear block of memory containing your values. The simplest approach is therefore to rearrange your data into the correct order beforehand. Note that send_data is a 1D array, but that doesn't matter: at the receiving end, the shape of recvbuf is already defined, and Scatterv just fills it up from the 1D input received.

# Imports
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
size = comm.Get_size()
rank = comm.Get_rank()

rows = 4
num_cols = rows // size  # columns per process

send_data=None

if rank == 0:
  data = np.array([[1, 2, 3, 4],
                   [5, 6, 7, 8],
                   [9, 10, 11, 12],
                   [13, 14, 15, 16]])

  # Split into sub-arrays along required axis
  arrs = np.split(data, size, axis=1)

  # Flatten the sub-arrays
  raveled = [np.ravel(arr) for arr in arrs]

  # Join them back up into a 1D array
  send_data = np.concatenate(raveled)


recvbuf = np.empty((rows, int(num_cols)), dtype='int')
comm.Scatterv(send_data, recvbuf, root=0)

print('Rank: ',rank, ', recvbuf received:\n ',recvbuf)
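
As a plain-NumPy sanity check of the reordering step (no MPI needed), each rank's contiguous slice of send_data reshapes into exactly the column block asked for in the question:

```python
import numpy as np

size, rows = 2, 4
data = np.array([[1, 2, 3, 4],
                 [5, 6, 7, 8],
                 [9, 10, 11, 12],
                 [13, 14, 15, 16]])

# Same reordering as above: split into column blocks, flatten, concatenate
send_data = np.concatenate([np.ravel(a) for a in np.split(data, size, axis=1)])

# Each rank's contiguous slice reshapes into its column block
per_rank = rows * (rows // size)
for r in range(size):
    print(send_data[r * per_rank:(r + 1) * per_rank].reshape(rows, rows // size))
```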

Upvotes: 1
