Yan Li

Reputation: 431

Specify the machines running program using MPI

I am going to do some parallel computing, and I am a complete beginner in this area. I will use MPI for the parallel work, with a master-slave model. I have four machines and want one of them to be the master node. However, I don't know how to specify the other machines that should run the program. Is there a way to do this, such as specifying the IP addresses of the slave nodes? How do I launch my program? I'm using Ubuntu 12.10.

Upvotes: 8

Views: 8747

Answers (1)

zonksoft

Reputation: 2429

Setup

Make sure you have the same directory and file structure on every node. For example, the executable should be at /home/yan/myprogram on every computer. One way to achieve this is to mount the same directory on every computer via NFS.

Set up SSH so that you can log in to every slave node from the master node like this:

yan@master:~/$ ssh slave1
yan@slave1:~/$

This means that the user yan has to exist on every computer. If you set up login via SSH key, you don't have to enter a password. If you log in with a password, you have to enter it each time you start the program.
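As a rough sketch of the key-based login mentioned above, the following script prints the commands that would create a key pair and copy the public key to each slave. The host names, the user yan, and the key path ~/.ssh/id_rsa are assumptions taken from the example; the RUN variable is a hypothetical dry-run switch, not part of SSH.

```shell
#!/bin/sh
# Worker names from the example above (assumed); use your own hosts or IPs.
HOSTS="slave1 slave2 slave3"
KEY="$HOME/.ssh/id_rsa"

# Dry run by default: each command is printed, not executed.
# Start the script with RUN= (empty) in the environment to really run them.
RUN="${RUN-echo}"

# Create a key pair only if none exists yet (ssh-keygen asks for a passphrase).
[ -f "$KEY" ] || $RUN ssh-keygen -t rsa -f "$KEY"

# Append the public key to ~/.ssh/authorized_keys on every worker.
for host in $HOSTS; do
  $RUN ssh-copy-id -i "$KEY.pub" "yan@$host"
done
```

Once the keys are copied, ssh slave1 from the master should log in without asking for the account password.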

Install OpenMPI using

sudo apt-get install openmpi-bin openmpi-doc libopenmpi-dev

You can install another MPI implementation, such as MPICH, instead.

Run program

Now, compile your program with mpicc myprogram.c -o myprogram (if you are using C; for C++, mpic++, etc.) and run it using

yan@master:~/$ mpirun -n 4 -H master,slave1,slave2,slave3 ./myprogram

Instead of machine names, you can also use IP addresses. -n specifies the number of processes. If you omit the option, one process will be started on each machine. You can also run several processes per machine by listing each host more than once:

yan@master:~/$ mpirun -n 8 -H master,slave1,slave2,slave3,\
master,slave1,slave2,slave3 ./myprogram

Alternatively, you can write one computer name per line into a file (here called HOSTFILE) and pass it like this:

yan@master:~/$ mpirun -hostfile HOSTFILE ./myprogram
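For illustration, a hostfile for the four machines could be generated as below. The slots=2 entries are an assumption (two processes per machine, matching the eight-process example); the slots keyword is Open MPI hostfile syntax, and other MPI implementations may use a different format. The script only prints the resulting mpirun command instead of running it.

```shell
#!/bin/sh
# Host names are taken from the examples above; replace them with your own.
cat > HOSTFILE <<'EOF'
master slots=2
slave1 slots=2
slave2 slots=2
slave3 slots=2
EOF

# Sum the slots= values to get a process count that fills every slot.
total=$(awk -F'slots=' 'NF > 1 { sum += $2 } END { print sum + 0 }' HOSTFILE)

# Print the launch command; run it by hand on the master node.
echo "mpirun -n $total -hostfile HOSTFILE ./myprogram"
```

Keeping the machine list in a file makes it easier to add or remove nodes than editing a long -H argument.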

These commands automatically connect to the slave computers via SSH, start the program, and set up the communication parameters so that the processes can exchange messages. Inside the program, MPI_Comm_rank then returns the rank (index) of the current process and MPI_Comm_size returns the total number of processes.

You can read about all of these options by running man mpirun.

Upvotes: 17
