How to assign multiple cores of a single node to a single job/process in an MPI cluster?

I have an MPI programme which I want to run on 30 nodes (each node has 32 cores). How can I assign all cores of a node to a single job/process?

I am using slots in the hostfile to restrict the number of jobs on a particular node:

node001 slots=1 max_slots=20
node002 slots=1 max_slots=20

Is there any parameter I can use to achieve this?

Thanks in advance.

Upvotes: 2

Views: 1699

Answers (1)

Quelqu'un

Reputation: 46

With Open MPI, you can use the --rankfile option of mpirun to explicitly set where each rank runs and which cores it is bound to.

The syntax of the file is described here: https://www.open-mpi.org/doc/v2.0/man1/mpirun.1.php

Here is a very simple MPI+OpenMP program:

#define _GNU_SOURCE   /* needed for sched_getcpu() on glibc */
#include <stdio.h>
#include <sched.h>
#include <mpi.h>
#include <omp.h>

int main(int argc, char** argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Each OpenMP thread reports the CPU it is currently running on. */
    #pragma omp parallel
    {
        printf("[%d:%d] %d\n", rank, omp_get_thread_num(), sched_getcpu());
    }

    MPI_Finalize();
    return 0;
}

It prints [MPI_rank:OMP_thread] cpu for each OpenMP thread.

The basic format for rankfiles is:

rank <rank>=<host> slot=<socket>:<cores>

With this basic rankfile (host Marvin, 2 cores on one socket):

>cat ./rankfile
rank 0=Marvin slot=0:0
rank 1=Marvin slot=0:0
rank 2=Marvin slot=0:0
rank 3=Marvin slot=0:1
rank 4=Marvin slot=0:0-1
rank 5=Marvin slot=0:0

This is the output I get:

>mpirun -n 6 --rankfile ./rankfile ./main
[0:0] 0
[1:0] 0
[2:0] 0
[3:0] 1
[4:0] 1
[4:1] 0
[5:0] 0
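
The thread counts in this output follow the core lists in the rankfile: rank 4 is bound to cores 0-1 and runs two threads, the other ranks are bound to one core and run one. As a rough sketch of that relationship, the helper below counts the cores named by a core list. count_cores is a hypothetical name used only for illustration, it is not part of Open MPI, and it handles only the plain number, range and comma forms.

```c
#include <stdio.h>
#include <stdlib.h>

/* Count the cores named by a rankfile core list such as "0", "0-1" or "0,2-3".
 * Returns -1 if the list is malformed.
 * Hypothetical helper for illustration, not an Open MPI function. */
int count_cores(const char *list)
{
    int total = 0;
    const char *p = list;

    while (*p != '\0') {
        char *end;
        long first = strtol(p, &end, 10);
        if (end == p || first < 0)
            return -1;              /* expected a core number */
        long last = first;
        p = end;
        if (*p == '-') {            /* range "first-last" */
            p++;
            last = strtol(p, &end, 10);
            if (end == p || last < first)
                return -1;
            p = end;
        }
        total += (int)(last - first + 1);
        if (*p == ',')
            p++;                    /* more entries follow */
        else if (*p != '\0')
            return -1;              /* unexpected character */
    }
    return total;
}
```

For example, count_cores("0-1") gives 2, which matches the two threads printed for rank 4, and count_cores("0-31") gives 32 for a node like the ones in the question.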

I didn't set the OMP_NUM_THREADS environment variable, so that OpenMP could detect how many cores are available to each rank. That is why rank 4, which is bound to cores 0-1, runs two threads while the other ranks run one.
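
For the cluster in the question (30 nodes with 32 cores each), the same idea would be one rank per node, bound to all of that node's cores. Writing 30 lines by hand is tedious, so here is a minimal sketch that generates them. It assumes the hosts are named node001 to node030, as the hostfile in the question suggests, and that logical cores are numbered 0-31 on every node. write_rankfile is a hypothetical helper name. On multi-socket nodes the socket:core form may be needed instead, so check the generated lines against the mpirun man page.

```c
#include <stdio.h>

/* Write one rankfile line per node: rank i runs on host "<prefix>NNN"
 * (NNN = i+1, zero-padded to three digits, e.g. node001) and is bound to
 * logical cores 0..cores_per_node-1.
 * Returns the number of lines written, or -1 on a write error.
 * Host naming and core numbering are assumptions, adjust them to your cluster. */
int write_rankfile(FILE *out, const char *prefix, int nodes, int cores_per_node)
{
    for (int i = 0; i < nodes; i++) {
        if (fprintf(out, "rank %d=%s%03d slot=0-%d\n",
                    i, prefix, i + 1, cores_per_node - 1) < 0)
            return -1;
    }
    return nodes;
}
```

Calling write_rankfile(stdout, "node", 30, 32) prints lines from "rank 0=node001 slot=0-31" to "rank 29=node030 slot=0-31". Saved as ./rankfile, they could then be used with mpirun -n 30 --rankfile ./rankfile ./main, leaving OMP_NUM_THREADS unset as above.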

Hope this helps.

Upvotes: 3
