Reputation: 77
I am using Ubuntu 15.04 on a two sockets Power8 machine, each socket has 10 cores. "numactl -H" outputs:
available: 4 nodes (0-3)
node 0 cpus: 0 8 16 24 32
node 0 size: 30359 MB
node 0 free: 26501 MB
node 1 cpus: 40 48 56 64 72
node 1 size: 0 MB
node 1 free: 0 MB
node 2 cpus: 80 88 96 104 112
node 2 size: 30425 MB
node 2 free: 27884 MB
node 3 cpus: 120 128 136 144 152
node 3 size: 0 MB
node 3 free: 0 MB
node distances:
node 0 1 2 3
0: 10 20 40 40
1: 20 10 40 40
2: 40 40 10 20
3: 40 40 20 10
The problem is, are there two NUMA nodes on each Power8 processor? Any why one has memory but the other one has nothing. I can't find any document about this. Any information would be appreciated.
A further question, if there are two nodes on a socket, then are their last level cache shared like NUMA nodes(a data can reside in all of the caches) or like on the same socket(only one copy can exist).
Upvotes: 4
Views: 678
Reputation: 74475
Scale-out POWER8 systems use Dual-Chip Modules (DCMs). As the name suggests, a DCM packages two multi-core chips with some additional logic within the same physical package. There is an on-package cache-coherent 32 GBps interconnect (misleadingly called an SMP bus) between the two chips and two separate paths to the external memory buffers, one for each chip. Thus, each socket is a dual-node NUMA system itself, similar to e.g., the multi-module AMD Opterons. In your case, all of the memory local to a given socket is probably installed in the slots belonging to the first chip of that socket only, therefore the second NUMA domain shows up as 0 MB.
Both the on-package (X bus) and inter-package (A bus) interconnects are cache-coherent, i.e. the L3 caches are kept in sync. Within a multi-core chip, each core is directly connected to a region of L3 cache and through the chip interconnect has access to all other L3 caches of the same chip, i.e. a NUCA (Non-Uniform Cache Architecture).
For more information, see the logical diagram of an S824 system in this Redpaper.
Upvotes: 3