eladm26

Reputation: 563

sched_getcpu doesn't work

I have a virtual machine on Google cloud with 1 CPU socket with 16 cores and 2 threads per core (hyper-threading).

This is the output of lscpu:

Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                32
On-line CPU(s) list:   0-31
Thread(s) per core:    2
Core(s) per socket:    16
Socket(s):             1
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 63
Stepping:              0
CPU MHz:               2300.000
BogoMIPS:              4600.00
Hypervisor vendor:     KVM
Virtualization type:   full
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              46080K
NUMA node0 CPU(s):     0-31

I'm running my process on it and I'm trying to distribute my threads among the different logical CPUs.

unsigned num_cpus = std::thread::hardware_concurrency();
LOG(INFO) << "Going to assign threads to " << num_cpus << " logical cpus";
cpu_set_t cpuset;
// Cast before subtracting so the bound cannot wrap around if num_cpus < 5.
for (int i = 0; i < static_cast<int>(num_cpus) - 5; i++) {
    worker_threads.push_back(std::thread(&CalculationWorker::work, &(workers[i]), i));
    // Create a cpu_set_t object representing a set of CPUs. Clear it and mark
    // only CPU i as set.
    CPU_ZERO(&cpuset);
    CPU_SET(i, &cpuset);
    // pthread_setaffinity_np returns an error number directly (it does not set errno).
    int rc = pthread_setaffinity_np(worker_threads[i].native_handle(),
            sizeof(cpu_set_t), &cpuset);
    if (rc != 0) {
        LOG(ERROR) << "Error calling pthread_setaffinity_np: " << rc << "\n";
    } else {
        LOG(INFO) << "Set affinity for worker " << i << " to " << i;
    }
}

The thing is that num_cpus is indeed 32 but when I run the following code line in every one of the running threads:

LOG(INFO) << "Worker thread " << worker_number << " on CPU " << sched_getcpu();  

sched_getcpu() returns 0 for all threads.
Does it have something to do with the fact that this is a virtual machine?

UPDATE:
I found out that pthread_setaffinity_np does work. Apparently there were some daemon processes running in the background, which is why I saw the other cores being utilised.
However, sched_getcpu still doesn't work: it returns 0 on all threads, although I can clearly see they run on different cores.

Upvotes: 3

Views: 2308

Answers (1)

Prateek Acharya

Reputation: 56

Can you try running this smaller program on your virtual machine:

#include <iostream>
#include <thread>
#include <pthread.h>  // pthread_self, pthread_setaffinity_np, pthread_getaffinity_np
#include <sched.h>    // cpu_set_t, CPU_* macros, sched_getcpu
using namespace std;

int main(int argc, char *argv[])
{
    int rc, i;
    cpu_set_t cpuset;
    pthread_t self;

    self = pthread_self();

    // Check the number of logical CPUs on the machine
    cout << thread::hardware_concurrency() << endl;

    /* Set affinity mask */
    CPU_ZERO(&cpuset);
    for (i = 0; i < 8; i++) // I have 4 cores with 2 threads per core, so I allow CPUs 0-7; adjust this to match your lscpu output
        CPU_SET(i, &cpuset);

    rc = pthread_setaffinity_np(self, sizeof(cpu_set_t), &cpuset);
    if (rc != 0)
        cout << "Error calling pthread_setaffinity_np !!! ";

    /* Read back the affinity mask that was actually applied to the thread */
    rc = pthread_getaffinity_np(self, sizeof(cpu_set_t), &cpuset);
    if (rc != 0)
        cout << "Error calling pthread_getaffinity_np !!!";

    cout << "pthread_getaffinity_np() returns:\n";
    for (i = 0; i < CPU_SETSIZE; i++)
    {
        if (CPU_ISSET(i, &cpuset))
            cout << " CPU " << i << endl;
    }

    // Report the CPU the main thread is currently running on
    cout << "This program (main thread) is on CPU " << sched_getcpu() << endl;
    return 0;
}

This will show whether pthread_setaffinity_np works on your VM. VMs impose no specific limitation here; a failure would more likely come from restrictions that the cloud provider's kernel enforces on some processes. You can read more about it here.

Alternatively, try sched_setaffinity() to confirm that you can actually set CPU sets on the VM.

I found your comment (when I set the affinity of all the threads to a single core, the threads are still running on different cores) and the original post's note (sched_getcpu() returns 0 for all threads) a little confusing. Probably this 0 is the value for your main thread (the process), and it is being reported in all the threads.

Upvotes: 1
