ljofre

Reputation: 324

How to Consume an mpi4py application from a serial python script

I tried to make a library based on mpi4py, but I want to use it in serial python code.

$ python serial_source.py

but serial_source.py calls a function named parallel_bar:

from foo import parallel_bar
# Can I call this mpi4py-based function as if it were ordinary Python code?
result = parallel_bar(num_proc = 5)

The motivation for this question is to find the right way to use mpi4py to speed up Python programs that were not necessarily designed to run entirely in parallel.

Upvotes: 5

Views: 1874

Answers (2)

NOhs

Reputation: 2830

This is indeed possible and is covered in the mpi4py documentation, in the section Dynamic Process Management. What you need is the so-called Spawn functionality, which is not available with MSMPI (in case you are working on Windows); see also Spawn not implemented in MSMPI.

Example

The first file provides a kind of wrapper to your function to hide all the MPI stuff, which I guess is your intention. Internally it calls the "actual" script containing your parallel code in 4 newly spawned processes.

With both files shown below in place, you can open a Python terminal and call:

from my_prog import parallel_fun

parallel_fun()
# Hi from 0/4
# Hi from 3/4
# Hi from 1/4
# Hi from 2/4
# We got the magic number 6

my_prog.py

import sys
import numpy as np
from mpi4py import MPI

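def parallel_fun():
    # Spawn 4 worker processes that each run child.py.
    comm = MPI.COMM_SELF.Spawn(
        sys.executable,
        args=['child.py'],
        maxprocs=4)

    # Buffer that receives the reduced sum on the parent side.
    N = np.array(0, dtype='i')

    # On the parent side of the intercommunicator, root=MPI.ROOT marks this process as the receiver.
    comm.Reduce(None, [N, MPI.INT], op=MPI.SUM, root=MPI.ROOT)

    print(f'We got the magic number {N}')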

Here is the child file with the parallel code:

child.py

from mpi4py import MPI
import numpy as np


comm = MPI.Comm.Get_parent()

print(f'Hi from {comm.Get_rank()}/{comm.Get_size()}')
N = np.array(comm.Get_rank(), dtype='i')

comm.Reduce([N, MPI.INT], None, op=MPI.SUM, root=0)

Upvotes: 7

Mark

Reputation: 93

Unfortunately, I don't think this is possible, as you have to run MPI code specifically with mpirun.

The best you can do is the opposite: write generic chunks of code that can be called by either an MPI process or a normal Python process.
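As a rough illustration of such a generic chunk, the sketch below keeps the numerical work in a function that knows nothing about MPI and adds a thin wrapper that takes an optional communicator. The names partial_sum and total_sum are hypothetical, and the strided split of the data is only one possible way to divide the work.

```python
def partial_sum(values, rank=0, size=1):
    """Sum the share of `values` owned by one worker (every size-th item)."""
    return sum(values[rank::size])


def total_sum(values, comm=None):
    """Return the full sum, either serially or across an MPI communicator.

    `comm` is assumed to be an mpi4py communicator such as MPI.COMM_WORLD.
    With comm=None the function runs as plain serial Python.
    """
    if comm is None:
        return partial_sum(values)
    local = partial_sum(values, comm.Get_rank(), comm.Get_size())
    # allreduce combines the per-rank results (sum by default) on every rank.
    return comm.allreduce(local)
```

Called as total_sum(data) it behaves like ordinary Python; called as total_sum(data, MPI.COMM_WORLD) from a script started with mpirun, each rank handles its own slice.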

The only other solution is to wrap the whole MPI part of your code in an external program and call it with subprocess from your non-MPI code. However, this will be tied quite heavily to your system configuration and is not really portable.

Subprocess is covered in the thread Using python with subprocess Popen, which is worth a look. The difficulty here is constructing the correct call in the first place, e.g.:

command = "/your/instance/of/mpirun /your/instance/of/python your_script.py -arguments"

You then have to get the result back into your serial (non-MPI) code. There are many ways to do that depending on its size; if you have to pass back large arrays, something like parallel HDF5 would be a good place to look.
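A minimal sketch of that approach for small results might look like the following. It assumes the launcher is called mpirun and accepts -n, and that the MPI script takes an output path as its last argument and has rank 0 write a JSON file there; both are assumptions about your setup, and the helper names build_mpi_command and run_and_collect are hypothetical.

```python
import json
import os
import subprocess
import sys
import tempfile


def build_mpi_command(script, num_proc, launcher="mpirun",
                      python=sys.executable, extra_args=()):
    """Build the argument list for launching `script` on `num_proc` ranks."""
    return [launcher, "-n", str(num_proc), python, script, *extra_args]


def run_and_collect(command):
    """Run `command` with a temporary file path appended, then load its JSON.

    Assumes the launched program writes its result as JSON to the path
    given as its last argument.
    """
    fd, out_path = tempfile.mkstemp(suffix=".json")
    os.close(fd)
    try:
        # check=True raises CalledProcessError if the launcher fails.
        subprocess.run([*command, out_path], check=True)
        with open(out_path) as fh:
            return json.load(fh)
    finally:
        os.remove(out_path)
```

From the serial script this would be used as result = run_and_collect(build_mpi_command("your_script.py", 5)). Passing the command as a list rather than a single string avoids shell quoting problems.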

Sorry I can't give you an easy solution.

Upvotes: 0
