CelinaPlusPlus

Reputation: 31

Spreading OpenMP threads among NUMA nodes

I have a matrix distributed across the local memories of four NUMA nodes. Now I want to start 4 threads, each one on a CPU belonging to a different NUMA node, so that each thread can access its part of the matrix as fast as possible. OpenMP has the "proc_bind(spread)" clause, but it places the threads on CPUs that are far apart yet still on the same NUMA node.

How can I force the threads to bind to different NUMA nodes?

Or, if that is not possible: when I use all cores on all nodes (256 threads total), I know how to get the ID of the NUMA node a thread runs on, but I can't control which thread gets which indices, e.g. in a for loop. How could I distribute my workload efficiently with respect to the NUMA configuration?

Upvotes: 3

Views: 1085

Answers (1)

Gilles

Reputation: 9519

Here is what I'd do:

  1. Check which cores are attached to which NUMA node using numactl -H.
  2. Assuming, for example, that cores 0, 1, 2 and 3 are each on a different one of the 4 NUMA nodes you want to use, set the environment variable OMP_PLACES to bind the threads to these cores: export OMP_PLACES="{0},{1},{2},{3}"
  3. Finally, launch your OpenMP binary with numactl's local memory allocation policy: numactl -l myBinary

From what I understood of your question, that should work.
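
If you don't want to type the core list by hand, the three steps can be scripted. The following is only a sketch under a few assumptions: the helper name places_from_numactl is made up, it relies on the "node N cpus: ..." lines that numactl -H prints, it simply takes the first CPU listed for each node, and the sample text stands in for real output (your CPU ids will differ). Setting OMP_PROC_BIND=true and OMP_NUM_THREADS=4 is my addition to make the binding request and thread count explicit.

```shell
#!/bin/sh
# Sketch: build an OMP_PLACES string with one place per NUMA node,
# using the first CPU listed for each node in `numactl -H` output.

# places_from_numactl (hypothetical helper name): reads `numactl -H`
# text on stdin and prints a place list such as {0},{64},{128},{192}.
# Nodes that list no CPUs (memory-only nodes) are skipped.
places_from_numactl() {
    awk '/^node [0-9]+ cpus:/ && NF >= 4 {
             printf "%s{%s}", sep, $4   # $4 is the first CPU id on the line
             sep = ","
         }'
}

# Made-up sample in the format printed by `numactl -H`.
sample='node 0 cpus: 0 1 2 3
node 1 cpus: 64 65 66 67
node 2 cpus: 128 129
node 3 cpus: 192 193'

printf '%s\n' "$sample" | places_from_numactl
echo    # prints {0},{64},{128},{192} followed by a newline

# On the real machine (not executed in this sketch):
#   export OMP_PLACES="$(numactl -H | places_from_numactl)"
#   export OMP_PROC_BIND=true    # assumption: request binding explicitly
#   export OMP_NUM_THREADS=4     # one thread per place
#   numactl -l ./myBinary
```

With one place per node and 4 threads, thread i should end up on the i-th place, so you can index your matrix parts by omp_get_thread_num(), provided part i actually lives on the node of place i.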

Upvotes: 3
