Reputation: 9
I have been searching for an appropriate method to measure the cost of various syscalls on Linux. Many questions on this topic have been asked in the past, but none provides a detailed description of how to measure it accurately. Most of the answers claim, without justification, that a syscall costs 1-2 µs, or a few hundred cycles if everything is cached on the CPU.
The naive way I can think of to measure the syscall cost is to use the rdtscp instruction around a syscall such as getpid(). However, this is insufficient for measuring the cost of open(), read() or write() calls accurately. I could modify the kernel and insert timing code around these functions, but that would require kernel changes, which I don't want to make. I wonder if there is a simpler solution that would allow me to measure it from userspace.
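For reference, here is a minimal sketch of that naive approach. It assumes an x86-64 machine and a GCC/Clang toolchain providing the `__rdtscp` intrinsic in `<x86intrin.h>`; the function name `min_getpid_ticks` is made up for illustration. It issues the raw syscall through `syscall(SYS_getpid)` because some older glibc versions cached the result of getpid() in userspace.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <x86intrin.h>   /* __rdtscp (x86 only) */

/* Hypothetical helper: smallest number of TSC ticks observed across one
 * raw getpid syscall. The result includes the overhead of rdtscp itself,
 * and TSC ticks count at the reference frequency, not the core clock. */
uint64_t min_getpid_ticks(int samples) {
    assert(samples > 0);
    uint64_t best = UINT64_MAX;
    unsigned aux;                          /* receives IA32_TSC_AUX, unused here */
    for (int i = 0; i < samples; i++) {
        uint64_t start = __rdtscp(&aux);
        syscall(SYS_getpid);               /* bypass any libc-level caching */
        uint64_t end = __rdtscp(&aux);
        if (end - start < best)
            best = end - start;            /* keep the least-disturbed sample */
    }
    return best;
}
```

Calling `min_getpid_ticks(100000)` gives a rough lower bound for the user-to-kernel round trip, but says little about calls like read() whose cost depends on the file and the cache state.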
Update (July 14): After a lot of searching, I found the libMicro benchmark suite from Red Hat: https://github.com/redhat-performance/libMicro
However, it was created a while ago and I am wondering how good it still is. Of course, it does not use rdtscp, and that adds some measurement error. Is there anything else missing from this benchmark?
Upvotes: 0
Views: 5747
Reputation: 50358
strace and perf are generally used to track and measure this kind of (kernel) operation. More specifically, perf can be used to generate flame graphs that let you see the in-kernel function calls in detail. However, keep in mind that the appropriate permissions need to be set in /proc/sys/kernel/perf_event_paranoid.
I advise you to put the syscall in a loop, since precisely measuring the cost of a single syscall, with possibly delayed/asynchronous work handed off to kernel threads, is either very hard from user space or simply inaccurate (on a non-customized kernel).
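A minimal sketch of the loop approach, assuming `CLOCK_MONOTONIC` and a raw getpid syscall as the call under test (the names `now_ns` and `avg_getpid_ns` are made up for this example). The average still includes the loop overhead and the cost of the user/kernel transition, so treat it as an estimate rather than an exact figure.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: current monotonic time in nanoseconds. */
static uint64_t now_ns(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Hypothetical helper: average wall-clock nanoseconds per raw getpid
 * syscall, amortizing the two clock reads over `iters` calls. */
double avg_getpid_ns(long iters) {
    assert(iters > 0);
    uint64_t start = now_ns();
    for (long i = 0; i < iters; i++)
        syscall(SYS_getpid);               /* the syscall being measured */
    uint64_t end = now_ns();
    return (double)(end - start) / (double)iters;
}
```

For example, `printf("%.1f ns per call\n", avg_getpid_ns(1000000));` prints the average. The same pattern should work for other calls (say, a 1-byte read() on an already open descriptor) by swapping the line inside the loop.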
Additional information:
strace works at microsecond granularity. Some of the POSIX clocks (see clock_gettime) can reach a granularity of 100 ns. Beyond this limit, rdtscp is AFAIK one of the most accurate options (one should take care about the reference frequency). As for perf, it makes use of hardware performance counters and kernel events. You may need to configure your kernel so that tracepoints can be generated and properly tracked by perf. perf can track one specific process or the complete system.
Upvotes: 2