Reputation: 2967
I would like to get timeseries
t0, misses
...
tN, misses
where tN
is a timestamp (second-resolution) and misses
is a number of times the kernel made disk-IO for my PID to load missing page of the mmap()
-ed memory region when process did access to that memory. Ok, maybe connection between disk-IO and memory-access is harder to track, lets assume my program can not do any disk-io with another (than assessing missing mmapped memory) reason. I THINK, I need to track something called node-load-misses
in perf world.
Any ideas how eBPF can be used to collect such data? What probes should I use?
Tried to use perf record
for similar purpose: I dislike how much data perf records. As I recall the try was like (also I dont remember how I parsed that output.data file):
perf record -p $PID -a -F 10 -e node-loads -e node-load-misses -o output.data
I thought eBPF could give some facility to implement such thing in less overhead way.
Upvotes: 0
Views: 1115
Reputation: 94235
Loading of mmaped pages which are not present in memory is not hardware event like perf's cache-misses
or node-loads
or node-load-misses
. When your program assess not present memory address, GPFault/pagefault exception is generated by hardware and it is handled in software by Linux kernel codes. For first access to anonymous memory physical page will be allocated and mapped for this virtual address; for access of mmaped file disk I/O will be initiated. There are two kinds of page faults in linux: minor and major, and disk I/O is major page fault.
You should try to use trace-cmd or ftrace or perf trace. Support of fault tracing was planned for perf tool in 2012, and patches were proposed in https://lwn.net/Articles/602658/
There is a tracepoint for page faults from userspace code, and this command prints some events with memory address of page fault:
echo 2^123456%2 | perf trace -e 'exceptions:page_fault_user' bc
With recent perf tool (https://mirrors.edge.kernel.org/pub/linux/kernel/tools/perf/) there is perf trace record
which can record both mmap syscalls and page_fault_user into perf.data and perf script
will print all events and they can be counted by some awk or python script.
Some useful links on perf and tracing: http://www.brendangregg.com/perf.html http://www.brendangregg.com/ebpf.html https://github.com/iovisor/bpftrace/blob/master/INSTALL.md And some bcc tools may be used to trace disk I/O, like https://github.com/iovisor/bcc/blob/master/examples/tracing/disksnoop.py or https://github.com/brendangregg/perf-tools/blob/master/examples/iosnoop_example.txt
And for simple time-series stat you can use perf stat -I 1000
command with correct software events
perf stat -e cpu-clock,page-faults,minor-faults,major-faults -I 1000 ./program
...
# time counts unit events
1.000112251 413.59 msec cpu-clock # 0.414 CPUs utilized
1.000112251 5,361 page-faults # 0.013 M/sec
1.000112251 5,301 minor-faults # 0.013 M/sec
1.000112251 60 major-faults # 0.145 K/sec
2.000490561 16.32 msec cpu-clock # 0.016 CPUs utilized
2.000490561 1 page-faults # 0.005 K/sec
2.000490561 1 minor-faults # 0.005 K/sec
2.000490561 0 major-faults # 0.000 K/sec
Upvotes: 2