Reputation: 639
I want to do distributed programming with Python using the mpi4py package. For testing, I set up a 5-node cluster via Google Container Engine and changed my code accordingly. But what are my next steps? How do I get my code running on all 5 VMs?
I tried to SSH into one VM of the cluster and run the code there, but it clearly was not distributed: every process stayed on the same machine :( [see example below]
from mpi4py import MPI
size = MPI.COMM_WORLD.Get_size()
rank = MPI.COMM_WORLD.Get_rank()
name = MPI.Get_processor_name()
print("Hello, World! I am process/rank {} of {} on {}.\n".format(rank, size,name))
mpiexec -n 5 python 5_test.py
Hello, World! I am process/rank 0 of 5 on gke-cluster-1-000000cd-node-mgff.
Hello, World! I am process/rank 1 of 5 on gke-cluster-1-000000cd-node-mgff.
Hello, World! I am process/rank 2 of 5 on gke-cluster-1-000000cd-node-mgff.
Hello, World! I am process/rank 3 of 5 on gke-cluster-1-000000cd-node-mgff.
Hello, World! I am process/rank 4 of 5 on gke-cluster-1-000000cd-node-mgff.
Upvotes: 2
Views: 1089
Reputation: 639
So, I figured out what I got wrong, and I think I should post the answer for anyone who might have a similar question.
Turns out, I should have read the documentation of mpi4py better :D
The command
mpirun -np 5 python 5_test.py
runs the program on a single multi-core host, as separate processes.
However, I wanted to distribute the task across several hosts. For that I needed the command
mpirun --hostfile <hostfile> python 5_test.py
where <hostfile> must be a file looking like this:
-- hostfile --
host1 slots=4
host2 slots=4
host3 slots=4
--------------
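In case it helps, here is a minimal sketch of how the hostfile text and the matching mpirun call could be generated from a list of hosts. The helper names format_hostfile and build_mpirun_command are made up for this illustration, and host1/host2/host3 are placeholders for your own VM names or IPs. It assumes the "host slots=N" hostfile syntax shown above, and it only builds the text and the command; it does not launch anything. On a typical setup each host also needs mpi4py installed and must be reachable from the launching node (usually via passwordless SSH), but check that for your own MPI installation.

```python
import shlex


def format_hostfile(hosts, slots=4):
    """Return hostfile text with one 'hostname slots=N' line per host."""
    return "".join("{} slots={}\n".format(host, slots) for host in hosts)


def build_mpirun_command(hostfile, script, python="python"):
    """Return the argument list for the mpirun call used in the answer."""
    return ["mpirun", "--hostfile", str(hostfile), python, script]


if __name__ == "__main__":
    # Placeholder host names (assumption): replace with your own VMs.
    hosts = ["host1", "host2", "host3"]

    # Print the hostfile content; save it to a file named e.g. 'hostfile'.
    print(format_hostfile(hosts, slots=4), end="")

    # Print the command to run once the hostfile has been saved.
    command = build_mpirun_command("hostfile", "5_test.py")
    print(" ".join(shlex.quote(part) for part in command))
```

If the ranks are really spread out, the "on <name>" part of the Hello World output should show more than one node name.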
Useful Link: https://github.com/jbornschein/mpi4py-examples
Upvotes: 2