Andrew Bainbridge

Reputation: 4808

Why does my Linux app get stopped every 0.5 seconds?

I have a 16-core Linux machine that is idle. If I run a trivial, single-threaded C program that sits in a loop reading the cycle counter forever (using the rdtsc instruction), then every 0.5 seconds I see a 0.17 ms jump in the timer value. In other words, every 0.5 seconds my application appears to be stopped for 0.17 ms. I would like to understand why this happens and what I can do about it. I understand Linux is not a real-time operating system; I'm just trying to understand what is going on so I can make the best use of what Linux provides.

I found someone else's software for measuring this problem - https://github.com/nokia/clocktick_jumps. Its results are consistent with my own.

To answer the "tell us what specific problem you are trying to solve" question: I work on high-speed networking applications using DPDK. About 60 million packets arrive per second. I need to decide what size to make the RX buffers, and to have good reasons for the number I pick. The answer to this question is one part of that puzzle.
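
As a rough back-of-the-envelope check using those figures (a sketch, not a measured result): at 60 million packets per second, a 0.17 ms stall means about 60e6 × 0.00017 ≈ 10,000 packets arrive while the application is stopped, so the RX ring needs at least that much headroom to ride out one stall without drops.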

My code looks like this:

// Build with gcc -O2 -Wall
#include <stdio.h>
#include <unistd.h>
#include <x86intrin.h>

int main() {
    // Bad way to learn the frequency of the cycle counter: assumes
    // usleep(1000000) sleeps for exactly one second.
    unsigned long long t1 = __rdtsc();
    usleep(1000000);
    double millisecs_per_tick = 1e3 / (double)(__rdtsc() - t1);

    // Loop forever. Print a message if any iteration takes longer than 0.1 ms.
    t1 = __rdtsc();
    while (1) {
        unsigned long long t2 = __rdtsc();
        double delta = t2 - t1;
        delta *= millisecs_per_tick;
        if (delta > 0.1) {
            printf("%4.2f - Delay of %.2f ms.\n", (double)t2 * millisecs_per_tick, delta);
        }
        t1 = t2;
    }

    return 0;
}
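
As an aside, the calibration at the top can be tightened up. Here is a minimal sketch of an alternative, assuming a Linux target (CLOCK_MONOTONIC_RAW is Linux-specific); the tsc_millisecs_per_tick helper name is mine, not part of the original program:

#include <time.h>        // clock_gettime, CLOCK_MONOTONIC_RAW
#include <x86intrin.h>   // __rdtsc

// Estimate milliseconds per TSC tick against CLOCK_MONOTONIC_RAW,
// which NTP does not adjust. Busy-waiting instead of sleeping keeps
// the scheduler's wakeup latency out of the measurement.
static double tsc_millisecs_per_tick(void) {
    struct timespec ts1, ts2;
    long long elapsed_ns;
    clock_gettime(CLOCK_MONOTONIC_RAW, &ts1);
    unsigned long long c1 = __rdtsc();
    do {
        clock_gettime(CLOCK_MONOTONIC_RAW, &ts2);
        elapsed_ns = (ts2.tv_sec - ts1.tv_sec) * 1000000000LL
                   + (ts2.tv_nsec - ts1.tv_nsec);
    } while (elapsed_ns < 100000000LL);   // spin for roughly 100 ms
    unsigned long long c2 = __rdtsc();
    return ((double)elapsed_ns / 1e6) / (double)(c2 - c1);
}

Calling this once at start-up in place of the usleep() estimate should make the printed delays a little more trustworthy, since usleep() can overshoot its requested interval.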

I'm running on Ubuntu 16.04, amd64. My processor is an Intel Xeon X5672 @ 3.20GHz.

Upvotes: 4

Views: 135

Answers (1)

viraptor

Reputation: 34175

I expect your system is scheduling another process to run on the same CPU, and your process is either preempted for a moment or migrated to another core, with some timing penalty either way.

You can find the reason by digging into the kernel events that happen at the same time. For example, systemtap or perf can give you some insight. I'd start with the scheduler events, to eliminate that possibility first: https://github.com/jav/systemtap/blob/master/tapset/scheduler.stp
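
With perf, something like this is a starting point (a sketch; "tsc_loop" stands in for your compiled binary, and the core number is arbitrary):

# Pin the test to one core so you know which CPU to trace.
taskset -c 3 ./tsc_loop &

# Record scheduler context-switch events on that core for ten seconds.
sudo perf record -e sched:sched_switch -C 3 -- sleep 10

# Dump the events and look for switches that line up with the printed delays.
sudo perf script

If sched_switch events on that core line up with the reported delays, the scheduler is the culprit; if not, other tracepoints (e.g. the irq:* family) are worth a look.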

Upvotes: 1
