Reputation: 125
For our project, written in C++, we run the processor cores in poll mode to poll the driver (DPDK), but in poll mode the CPU utilization shows up as 100% in top/htop. After we started seeing glitches of packet drops, we calculated the number of loops (polls) executed per second on a core (this varies with the processor speed and type).
Sample code used to calculate the polls/second, with and without the overhead of the driver poll function, is below.
#include <iostream>
#include <sys/time.h>

int main() {
    unsigned long long counter = 0;
    struct timeval tv1, tv2;

    gettimeofday(&tv1, NULL);
    while (1) {
        gettimeofday(&tv2, NULL);
        // Some function here to measure the overhead
        // Poll the driver
        if ((double) (tv2.tv_usec - tv1.tv_usec) / 1000000 +
            (double) (tv2.tv_sec - tv1.tv_sec) > 1.0) {
            std::cout << std::dec << "Executions per second = "
                      << counter << " per second" << std::endl;
            counter = 0;
            gettimeofday(&tv1, NULL);
        }
        counter++;
    }
}
The poll count results vary; sometimes we see a glitch and the number drops to 50% or less of the regular count. We thought this could be a problem with Linux scheduling the task, so we isolated the cores on the kernel command line (isolcpus=...), set CPU affinity, and raised the process/thread priority to the highest nice value and to realtime (RT) scheduling.
But no difference.
So the questions are: can we rely on the number of loops/polls per second executed on a processor core in poll mode?
Is there a way to calculate the actual CPU occupancy in poll mode, since the core's CPU utilization shows up as 100% in top?
Is this the right approach for this problem?
Environment:
Not sure if this was previously answered, any references will be helpful.
Upvotes: 0
Views: 2590
Reputation: 35
Modern Linux on an Intel CPU does provide ways to make a poll loop fully occupy a CPU core at near 100%. Things you haven't considered: remove system calls that cause context switches, turn off hyperthreading (or don't use the sibling hyperthread on the same physical core), turn off dynamic CPU frequency boost in the BIOS, and move interrupt handling off the polling core.
Upvotes: 0
Reputation: 118350
No, you cannot rely on "the number of loops/polls per sec executed on a processor core in poll mode".
This is a fundamental aspect of the execution environment in a traditional operating system, such as the one you are using: mainstream Linux.
At any time, a heavy-weight cron job can get kicked off that makes immediate demands on some resources, and the kernel's scheduler decides to preempt your application and do something else. That would be just one of hundreds of possible reasons why your process gets preempted.
Even if you're running as root, you won't be in full control of your process's resources.
The fact that you're seeing such a wide, occasional, disparity in your polling metrics should be a big, honking clue: multi-tasking operating systems don't work like this.
There are other "realtime" operating systems where userspace apps can have specific "service level" guarantees, i.e. minimum CPU or I/O resources available, which you can rely on for guaranteeing a floor on the number of times a particular code sequence can be executed, per second or some other metric.
On Linux, there are a few things that can be fiddled with, such as the process's nice level, and a few other things. But that still will not give you any absolute guarantees whatsoever.
Especially since you're not even running on bare metal, but you're running inside a virtual hypervisor. So, your actual execution profile is affected not just by your host operating system, but your guest operating system as well!
The only way to guarantee the kind of metric that you're looking for is to use a realtime operating system instead of Linux. Some years ago I heard about realtime extensions to the Linux kernel (Google food: "linux rtos"), but haven't heard much about them recently. I don't believe that mainstream Linux distributions include that kernel extension, so if you want to go that way, you'll be on your own.
Upvotes: 3