yusk

Reputation: 1

Microsoft PPL on a 2-socket machine

I use PPL on a 2-socket Windows machine (16C32T x 2 = 64 logical cores).
CurrentScheduler->GetNumberOfVirtualProcessors() reports 64 processors.
But concurrency::parallel_for uses only the first socket, and total CPU usage never reaches 100%.

How can I use all sockets (all NUMA nodes) with one parallel_for?

Upvotes: -3

Views: 39

Answers (1)

Yousha Aleayoub

Reputation: 5703

I think you got it wrong...
concurrency::parallel_for runs on the Concurrency Runtime's default scheduler unless you attach another one, so it may NOT distribute the workload evenly across all sockets.

One thing to try is a custom scheduler in place of the default one, configured so that work can reach every socket. Something like this:

#include <ppl.h>
#include <concrt.h>

int main()
{
    // Schedulers are configured through a SchedulerPolicy, not by subclassing
    // concurrency::Scheduler. Ask for up to 64 virtual processors, the
    // logical core count from the question; the minimum of 1 is a placeholder.
    concurrency::SchedulerPolicy policy(2,
        concurrency::MinConcurrency, 1,
        concurrency::MaxConcurrency, 64);

    // Create a scheduler with that policy and attach it to the calling
    // thread, so the parallel_for below runs on it instead of the default.
    concurrency::CurrentScheduler::Create(policy);

    concurrency::parallel_for(0, 100, [](int i) {
        // Your parallel code here.
    });

    // Detach the scheduler again; the previous one (if any) is restored.
    concurrency::CurrentScheduler::Detach();

    return 0;
}

It's just a concept; I have NOT tested it yet.
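
If the scheduler alone does not spread the load, the other half of "assigning work to each socket" is splitting the iteration range into one contiguous chunk per NUMA node. Below is a minimal, portable sketch of that splitting step only. split_range is a hypothetical helper of mine, not part of PPL, and the node count of 2 in the note underneath is simply taken from the machine in the question.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper (not a PPL API): split [first, last) into `parts`
// contiguous half-open chunks whose sizes differ by at most one.
std::vector<std::pair<int, int>> split_range(int first, int last, int parts)
{
    assert(parts > 0 && first <= last);

    std::vector<std::pair<int, int>> chunks;
    chunks.reserve(static_cast<std::size_t>(parts));

    const int total = last - first;
    const int base  = total / parts;  // minimum chunk size
    const int extra = total % parts;  // the first `extra` chunks get one more

    int begin = first;
    for (int p = 0; p < parts; ++p) {
        const int end = begin + base + (p < extra ? 1 : 0);
        chunks.emplace_back(begin, end);  // chunk p covers [begin, end)
        begin = end;
    }
    return chunks;
}
```

For the loop above, split_range(0, 100, 2) gives the chunks [0, 50) and [50, 100). Each chunk could then get its own parallel_for, started from a task placed on the matching node (concurrency::location::from_numa_node looks like the intended hook for that, but I have not tried it on a 2-socket box).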

Upvotes: 0
