Reputation: 1768
I am working with a very basic Python code (filename: test_mpi.py) to try out parallel programming in Python using mpi4py. What I am trying to do is to have a two-dimensional numpy array with zeros for all entries, and then use specific processors in a cluster to increase the values of specific elements of that array.
Specifically, I have a 3*3 numpy matrix (mat) with all elements set to zero. After my code finishes running (across multiple processors), I want the matrix to look like this:
mat = [[ 1.  2.  3.]
       [ 4.  5.  6.]
       [ 7.  8.  9.]]
This is a fairly simple task and I expect my code to finish running within a few minutes (if not less). However, my code keeps running for a very long time and never stops executing (ultimately I have to delete the job after many hours).
This is my code:
from __future__ import division
from mpi4py import MPI
import os
import time
import numpy as np

comm = MPI.COMM_WORLD
nproc = comm.Get_size()
rank = comm.Get_rank()

start_time = time.time()

mat = np.zeros((3,3))
comm.bcast([ mat , MPI.DOUBLE], root=0)

for proc in range(1, nproc):
    if rank == proc:
        print "I'm processor: ", rank
        var = proc
        comm.send( var, dest=0, tag = (proc*1000) )
        print "Processor: ", rank, " finished working."

if rank == 0:
    print "Hello! I'm the master processor, rank: ", rank
    for i in range(0,dim):
        for j in range(0, dim):
            proc = ((i*j)+1)
            mat[i,j] += comm.recv(source=proc, tag=(proc*1000) )
    np.savetxt('mat.txt', mat)

print time.time() - start_time
This is the job script I use to execute this Python code:
#!/bin/sh
#PBS -l nodes=2:ppn=16
#PBS -N test_mpi4py
#PBS -m abe
#PBS -l walltime=168:00:00
#PBS -j eo
#PBS -q physics
cd $PBS_O_WORKDIR
export OMP_NUM_THREADS=16
export I_MPI_PIN=off
echo 'This job started on: ' `date`
/opt/intel/impi/2018.0.128/intel64/bin/mpirun -np 32 python test_mpi.py
I use qsub jobscriptname.sh to submit the job script. What am I missing here? I would appreciate any help.
Upvotes: 1
Views: 2537
Reputation: 17159
Your code did not finish because some of the MPI communications did not complete.
MPI requires that for every send there should be exactly one receive.
Your first loop is executed by every MPI rank independently. The condition rank == proc is satisfied exactly once for each rank except rank 0, so comm.send is executed nproc - 1 times. Your second loop runs dim * dim times, so comm.recv is also executed dim * dim times. Unless nproc - 1 == dim * dim, this requirement is not satisfied and some recv or send operations wait indefinitely for a matching operation. In your example 31 != 9, so the communications never complete and the job runs until the walltime is exceeded.
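As an aside, here is a minimal sketch (not from the original post) of that matching rule with mpi4py's pickle-based send/recv: one blocking send on a worker rank paired with exactly one receive on rank 0.

from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

# Rank 1 posts exactly one send; rank 0 posts exactly one matching receive.
# If either side were missing, the other would block forever.
if rank == 1:
    comm.send(42, dest=0, tag=0)           # blocking, pickle-based send
elif rank == 0:
    value = comm.recv(source=1, tag=0)     # matching receive
    print("rank 0 received %d" % value)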
In order to fix this error, let us clarify the algorithm a bit. We want each of the ranks from 1 to 9 to be responsible for one element of the 3x3 matrix. Each of these ranks posts a comm.send request. The requests are received one by one by rank 0 and stored in the corresponding elements of the matrix. Any remaining ranks, if they are available, do nothing.
Let us introduce three changes:

1. Define the matrix dimension dim explicitly, since it is never defined in the original code.
2. Replace the loop over ranks with a condition: each rank from 1 to dim * dim sends its own value to rank 0, rank 0 does the receiving, and any extra ranks do nothing.
3. Fix the mapping from the matrix element mat[i,j] to the sending rank, which currently is not correct (e.g. for the central element mat[1,1] the rank should be 5, not 1 * 1 + 1 = 2).

Here is what I got after the modifications:
from __future__ import division
from mpi4py import MPI
import os
import time
import numpy as np

comm = MPI.COMM_WORLD
nproc = comm.Get_size()
rank = comm.Get_rank()

start_time = time.time()

dim = 3
mat = np.zeros((dim,dim))
comm.bcast([ mat , MPI.DOUBLE], root=0)

if rank > 0:
    # Worker ranks 1 .. dim*dim each send their own rank number to rank 0.
    if rank <= dim * dim:
        print "I'm processor: ", rank
        var = rank
        req = comm.send( var, dest=0, tag = (rank*1000) )
        print "Processor: ", rank, " finished working."
else:
    # Rank 0 receives one value per matrix element and stores it.
    print "Hello! I'm the master processor, rank: ", rank
    for i in range(0,dim):
        for j in range(0, dim):
            proc = ((i*dim)+j)+1
            if proc < nproc:
                mat[i,j] += comm.recv(source=proc, tag=(proc*1000) )
    np.savetxt('mat.txt', mat)
And here is the output:
mpirun -np 5 python mpi4.py saves the following matrix to mat.txt:
1.000000000000000000e+00 2.000000000000000000e+00 3.000000000000000000e+00
4.000000000000000000e+00 0.000000000000000000e+00 0.000000000000000000e+00
0.000000000000000000e+00 0.000000000000000000e+00 0.000000000000000000e+00
And mpirun -np 32 python mpi4.py saves the following matrix to mat.txt:
1.000000000000000000e+00 2.000000000000000000e+00 3.000000000000000000e+00
4.000000000000000000e+00 5.000000000000000000e+00 6.000000000000000000e+00
7.000000000000000000e+00 8.000000000000000000e+00 9.000000000000000000e+00
Note that 10 (one master rank plus dim * dim = 9 worker ranks) is the minimal number of process ranks that would produce the complete result.
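As a quick check (a sketch assuming the same dim = 3 and the rank mapping used in the corrected code), the following prints which rank fills which element and shows why one master plus dim * dim workers, i.e. 10 ranks, are needed for the full matrix:

dim = 3

# Element (i, j) is filled by the value received from rank i*dim + j + 1,
# matching proc = ((i*dim)+j)+1 in the corrected code above.
for i in range(dim):
    for j in range(dim):
        proc = i * dim + j + 1
        print("mat[%d,%d] is filled by rank %d" % (i, j, proc))

# Ranks 1..dim*dim fill the matrix and rank 0 collects them,
# so 1 + dim*dim = 10 processes are the minimum for a complete result.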
Upvotes: 5