Reputation: 35
My mpi4py (3.1.5) installation with Open MPI (4.1.4) on Python 3.8 and Ubuntu 20.04 suddenly stopped working today. Whenever I run anything that imports mpi4py in Python, I get the following error:
[juanMS:15643] [[INVALID],INVALID] ORTE_ERROR_LOG: A system-required executable either could not be found or was not executable by this user in file ess_singleton_module.c at line 572
[juanMS:15643] [[INVALID],INVALID] ORTE_ERROR_LOG: A system-required executable either could not be found or was not executable by this user in file ess_singleton_module.c at line 172
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):
orte_ess_init failed
--> Returned value A system-required executable either could not be found or was not executable by this user (-126) instead of ORTE_SUCCESS
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems. This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):
ompi_mpi_init: ompi_rte_init failed
--> Returned "A system-required executable either could not be found or was not executable by this user" (-126) instead of "Success" (0)
--------------------------------------------------------------------------
*** An error occurred in MPI_Init_thread
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
*** and potentially your MPI job)
[juanMS:15643] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
This is really frustrating because I did not make any system changes or package updates. I have tried removing all Open MPI packages from my system and mpi4py from my Python venv:
sudo apt purge --autoremove libopenmpi-dev libopenmpi3 mpich openmpi-bin openmpi-common
pip uninstall mpi4py
I have tried this multiple times and for some reason the same error keeps popping up. There is nothing wrong that I can see with my openmpi version, as a simple test like this works fine:
mpirun -np 4 hostname
I have found virtually no help online, so I'm hoping someone here can guide me in the right direction!
EDIT --------------------------------------------------------------
Example Python script that reproduces the error with mpi4py=3.1.5, openmpi=4.1.4, and python3.8 on Ubuntu 20.04:
from mpi4py import MPI
import sys

def print_hello(rank, size, name):
    msg = "Hello World! I am process {0} of {1} on {2}.\n"
    sys.stdout.write(msg.format(rank, size, name))

if __name__ == "__main__":
    size = MPI.COMM_WORLD.Get_size()
    rank = MPI.COMM_WORLD.Get_rank()
    name = MPI.Get_processor_name()
    print_hello(rank, size, name)
It appears mpi4py=3.1.5 is compatible with openmpi=4.0.X and mpich=3.3.2, as far as I have tested.
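To narrow down a possible mismatch between mpi4py and the installed MPI library, it may help to look at how mpi4py was built without triggering MPI_Init. The following is only a sketch: the helper name mpi4py_build_config is made up for illustration, and it assumes that importing the top-level mpi4py package (as opposed to mpi4py.MPI) does not initialize MPI, which is my understanding of how the package behaves.

```python
def mpi4py_build_config():
    """Return mpi4py's build-time configuration as a dict, or None if mpi4py is not installed.

    Only the top-level package is imported here; `from mpi4py import MPI`
    is deliberately avoided because that is the import that fails in the
    question above.
    """
    try:
        import mpi4py
    except ImportError:
        return None
    # get_config() reports settings recorded when mpi4py was compiled,
    # such as the MPI compiler wrapper that was used.
    return dict(mpi4py.get_config())


if __name__ == "__main__":
    config = mpi4py_build_config()
    if config is None:
        print("mpi4py is not installed in this environment")
    else:
        for key, value in sorted(config.items()):
            print(f"{key} = {value}")
```

If the reported compiler wrapper points at an MPI installation that has since been removed or replaced, rebuilding mpi4py against the current one would be the next thing to try.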
Upvotes: 0
Views: 342
Reputation: 1
I had a similar issue and solved it using:
sudo apt purge --autoremove libopenmpi-dev libopenmpi3 mpich openmpi-bin openmpi-common
sudo pip uninstall mpi4py
Followed by:
sudo apt install libopenmpi-dev libopenmpi3 mpich openmpi-bin openmpi-common
sudo pip install mpi4py
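Since the error complains about a system-required executable that could not be found, it may be worth checking after the reinstall that Python can actually see the Open MPI helper binaries. This is a minimal sketch, assuming the missing executable is orted (which, as far as I understand, Open MPI 4.x launches when a script is run directly rather than through mpirun). The helper name find_mpi_executables is made up for illustration.

```python
import os
import shutil


def find_mpi_executables(names=("mpirun", "orted")):
    """Map each executable name to its resolved path on PATH, or None if it is not found."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, location in find_mpi_executables().items():
        print(f"{name}: {location or 'NOT FOUND on PATH'}")
    # OPAL_PREFIX overrides where Open MPI looks for its own files;
    # a stale value left in the environment can point it at a removed installation.
    print("OPAL_PREFIX =", os.environ.get("OPAL_PREFIX"))
```

Run it with the same interpreter and environment (venv, sudo or not) that produces the error, since PATH can differ between them.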
Upvotes: 0