paeion

Reputation: 35

mpi4py suddenly not compatible with openmpi on ubuntu

My mpi4py (3.1.5) installation with Open MPI (4.1.4) on Python 3.8 and Ubuntu 20.04 stopped working today for no apparent reason. Whenever I run anything that imports mpi4py in Python, I get the following error:

[juanMS:15643] [[INVALID],INVALID] ORTE_ERROR_LOG: A system-required executable either could not be found or was not executable by this user in file ess_singleton_module.c at line 572
[juanMS:15643] [[INVALID],INVALID] ORTE_ERROR_LOG: A system-required executable either could not be found or was not executable by this user in file ess_singleton_module.c at line 172
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  orte_ess_init failed
  --> Returned value A system-required executable either could not be found or was not executable by this user (-126) instead of ORTE_SUCCESS
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems.  This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

  ompi_mpi_init: ompi_rte_init failed
  --> Returned "A system-required executable either could not be found or was not executable by this user" (-126) instead of "Success" (0)
--------------------------------------------------------------------------
*** An error occurred in MPI_Init_thread
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***    and potentially your MPI job)
[juanMS:15643] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!

This is really frustrating because I did not make any system changes or package updates. I have tried removing all MPI packages from my system and mpi4py from my Python venv:

sudo apt purge --autoremove libopenmpi-dev libopenmpi3 mpich openmpi-bin openmpi-common

pip uninstall mpi4py

I have tried this multiple times, and the same error keeps appearing. As far as I can tell, there is nothing wrong with the Open MPI installation itself, since a simple test like this works fine:

mpirun -np 4 hostname 
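
One check that may help narrow this down: the `ess_singleton_module.c` lines in the log suggest the failure happens when Python is started directly rather than under `mpirun`. As far as I understand, in that mode Open MPI 4.x tries to launch its helper daemon, `orted`, and the "system-required executable" in the message could be that binary. This is an assumption, not something the log confirms. A minimal sketch (the function name `find_mpi_executables` is made up for illustration) that reports which of these executables the current environment can see on PATH:

```python
import shutil


def find_mpi_executables(names=("mpirun", "mpiexec", "orted")):
    """Map each executable name to its resolved path on PATH, or None if missing."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    # Run this with the same interpreter and environment (e.g. the venv)
    # in which the mpi4py import fails.
    for name, path in find_mpi_executables().items():
        print(f"{name}: {path or 'NOT FOUND on PATH'}")
```

If `mpirun` resolves but `orted` does not, or the two resolve to different installation prefixes, that would point to a mixed or incomplete Open MPI installation rather than a problem in mpi4py itself.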

I have found virtually no help online, so I'm hoping someone here can guide me in the right direction!

EDIT --------------------------------------------------------------

Example Python script to reproduce the error with mpi4py=3.1.5, openmpi=4.1.4, and python3.8 on Ubuntu 20.04:

from mpi4py import MPI
import sys

def print_hello(rank, size, name):
    msg = "Hello World! I am process {0} of {1} on {2}.\n"
    sys.stdout.write(msg.format(rank, size, name))

if __name__ == "__main__":
    size = MPI.COMM_WORLD.Get_size()
    rank = MPI.COMM_WORLD.Get_rank()
    name = MPI.Get_processor_name()

    print_hello(rank, size, name)

As far as I have tested, mpi4py=3.1.5 works with openmpi=4.0.X and mpich=3.3.2.
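
Since the behaviour seems to depend on which MPI library mpi4py is paired with, it can help to see what the installed mpi4py was built against. A rough sketch follows, assuming mpi4py is importable in the environment; `mpi4py_build_info` is a hypothetical helper name. Importing the top-level `mpi4py` package does not call `MPI_Init` (only `from mpi4py import MPI` does), so this should run even when the script above aborts:

```python
def mpi4py_build_info():
    """Return mpi4py's version and build configuration, or None if not installed."""
    try:
        # Top-level import only; MPI is not initialized here.
        import mpi4py
    except ImportError:
        return None
    return {
        "version": mpi4py.__version__,
        # Build-time settings recorded by mpi4py, such as the MPI compiler wrapper.
        "config": mpi4py.get_config(),
    }


if __name__ == "__main__":
    info = mpi4py_build_info()
    if info is None:
        print("mpi4py is not installed in this environment")
    else:
        print("mpi4py version:", info["version"])
        for key, value in info["config"].items():
            print(f"  {key}: {value}")
```

If the compiler wrapper recorded there belongs to a different MPI installation than the one `mpirun` comes from, mpi4py was likely built against another library than the one found at run time.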

Upvotes: 0

Views: 342

Answers (1)

I had a similar issue and solved it using:

sudo apt purge --autoremove libopenmpi-dev libopenmpi3 mpich openmpi-bin openmpi-common
sudo pip uninstall mpi4py

Then I reinstalled everything:

sudo apt install libopenmpi-dev libopenmpi3 mpich openmpi-bin openmpi-common
sudo pip install mpi4py

Upvotes: 0
