rth

Reputation: 11199

OpenMPI and rank/core bindings

I'm having an issue with OpenMPI where separate mpirun invocations repeatedly bind their MPI ranks to the same CPU cores.

I'm using a server with 32 hardware cores (no hyper-threading), Ubuntu 14.04.2 LTS and OpenMPI 1.8.4, compiled with Intel compiler 15.0.1.

For instance, I can run my executable with 8 MPI ranks and get the following rank-to-core bindings:

$ mpirun -n 8 --report-bindings ./executable
[simple:16778] MCW rank 4 bound to socket 0[core 1[hwt 0]]: [./B/./././././.][./././././././.][./././././././.][./././././././.]
[simple:16778] MCW rank 5 bound to socket 1[core 9[hwt 0]]: [./././././././.][./B/./././././.][./././././././.][./././././././.]
[simple:16778] MCW rank 6 bound to socket 2[core 17[hwt 0]]: [./././././././.][./././././././.][./B/./././././.][./././././././.]
[simple:16778] MCW rank 7 bound to socket 3[core 25[hwt 0]]: [./././././././.][./././././././.][./././././././.][./B/./././././.]
[simple:16778] MCW rank 0 bound to socket 0[core 0[hwt 0]]: [B/././././././.][./././././././.][./././././././.][./././././././.]
[simple:16778] MCW rank 1 bound to socket 1[core 8[hwt 0]]: [./././././././.][B/././././././.][./././././././.][./././././././.]
[simple:16778] MCW rank 2 bound to socket 2[core 16[hwt 0]]: [./././././././.][./././././././.][B/././././././.][./././././././.]
[simple:16778] MCW rank 3 bound to socket 3[core 24[hwt 0]]: [./././././././.][./././././././.][./././././././.][B/././././././.]

which works as expected.

The problem is that if I run this command a second time (a separate run in a different folder), I get exactly the same bindings again. This means that out of 32 CPU cores, 8 carry the load twice while the remaining 24 do nothing.

I am aware of the different mpirun options for binding by core, socket, etc. I could, for instance, explicitly specify the cores to be used with the --cpu-set argument (sketched after the option listing below), or, more generally, there is the ranking policy:

--rank-by Ranking Policy [slot (default) | hwthread | core | socket | numa | board | node]
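For example, I think something along these lines would let me pin a second run to an explicitly chosen set of cores (cores 8-15 here; I haven't checked whether my OpenMPI version accepts ranges in --cpu-set):

$ mpirun -n 8 --cpu-set 8-15 --bind-to core --report-bindings ./executable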

What I'm looking for, instead, is a way to automatically distribute the load on the CPU cores that are free, and not reuse the same cores twice. Is there some policy that controls this?

Upvotes: 3

Views: 3465

Answers (1)

Jofe

Reputation: 2729

Are you running the executables simultaneously? If not, the behavior of your system seems quite logical. If you want to run two instances at the same time and make sure they run on different cores, you can try something like this:

numactl --physcpubind=0-7 mpirun -n 8 --report-bindings ./executable &

numactl --physcpubind=8-31 mpirun -n 24 --report-bindings ./executable
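numactl --physcpubind restricts mpirun and everything it launches to the listed logical CPUs, so OpenMPI should then pick its bindings from within that set. To double-check that the two jobs really ended up on disjoint cores (assuming your binary is literally named executable), you can inspect the affinity of the running processes, e.g.:

for pid in $(pgrep -x executable); do taskset -cp "$pid"; done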

Upvotes: 4
