Reputation: 93
I am trying to do this: I am sending packet through iperf-an open source tool from one machine to another and I want to trace the write system or send call. Please help me to do that,if someone can guide through Ftrace framework to trace system call that would be great otherwise by any other tracing tool.
Upvotes: 3
Views: 5945
Reputation: 5270
Apart from ftrace, you can also use perf like:
sudo perf record --call-graph fp ./fio --filename=aa --size=4G --direct=0 --rw=write --bs=1M --iodepth=512 --numjobs=1 --name=job_name
Then:
sudo perf report --call-graph
and expand vfs_write:
Children Self Command Shared Object Symbol
+ 80.17% 0.00% fio [kernel.kallsyms] [k] entry_SYSCALL_64_after_hwframe ◆
+ 80.15% 0.02% fio [kernel.kallsyms] [k] do_syscall_64 ▒
+ 78.57% 0.07% fio libpthread-2.31.so [.] __libc_pwrite64 ▒
- 78.49% 0.04% fio [kernel.kallsyms] [k] vfs_write ▒
- 78.46% vfs_write ▒
- 78.40% new_sync_write ▒
- 78.37% ext4_file_write_iter ▒
- 78.33% ext4_buffered_write_iter ▒
- 77.73% generic_perform_write ▒
- 50.15% ext4_da_write_begin ▒
- 38.72% grab_cache_page_write_begin ▒
- 38.61% pagecache_get_page ▒
- 23.52% __page_cache_alloc ▒
- 23.16% alloc_pages ▒
- 22.63% __alloc_pages ▒
- 22.00% get_page_from_freelist ▒
- 18.16% prep_new_page ▒
17.84% clear_page_rep ▒
1.80% rmqueue ▒
1.28% node_dirty_ok ▒
- 12.84% add_to_page_cache_lru ▒
- 9.77% __add_to_page_cache_locked ▒
- 1.89% __mem_cgroup_charge ▒
1.01% charge_memcg ▒
0.77% get_mem_cgroup_from_mm ▒
- 1.09% xa_get_order ▒
- 0.89% xas_load ▒
0.53% xas_descend ▒
- 2.69% lru_cache_add ▒
- 2.37% __pagevec_lru_add ▒
2.01% __lock_text_start ▒
- 1.21% xas_load ▒
0.66% xas_descend ▒
- 10.38% ext4_block_write_begin ▒
- 7.01% create_empty_buffers ▒
- 5.98% alloc_page_buffers ▒
- 5.59% alloc_buffer_head ▒
- 4.24% kmem_cache_alloc ▒
- 1.03% ___slab_alloc ▒
- new_slab ▒
- 0.87% allocate_slab ▒
- 0.57% alloc_pages ▒
- 0.55% __alloc_pages ▒
get_page_from_freelist ▒
0.84% get_obj_cgroup_from_current ▒
0.53% memset ▒
- 2.42% ext4_da_get_block_prep ▒
- 1.26% ext4_da_map_blocks.constprop.0 ▒
0.53% ext4_es_lookup_extent ▒
- 16.27% copy_page_from_iter_atomic ▒
copy_user_generic_string ▒
- 8.74% ext4_da_write_end ▒
- 8.23% generic_write_end ▒
- 7.83% block_write_end ▒
- __block_commit_write.isra.0 ▒
- 6.89% mark_buffer_dirty ▒
- 5.66% __set_page_dirty ▒
5.54% __lock_text_start ▒
1.35% __get_user_nocheck_1 ▒
+ 78.48% 0.00% fio [kernel.kallsyms] [k] __x64_sys_pwrite64 ▒
+ 78.48% 0.00% fio [kernel.kallsyms] [k] ksys_pwrite64 ▒
+ 78.40% 0.02% fio [kernel.kallsyms] [k] new_sync_write ▒
+ 78.37% 0.02% fio [kernel.kallsyms] [k] ext4_file_write_iter ▒
+ 78.33% 0.00% fio [kernel.kallsyms] [k] ext4_buffered_write_iter ▒
+ 77.73% 0.27% fio [kernel.kallsyms] [k] generic_perform_write ▒
+ 50.28% 0.45% fio [kernel.kallsyms] [k] ext4_da_write_begin ▒
+ 38.76% 0.07% fio [kernel.kallsyms] [k] grab_cache_page_write_begin ▒
+ 38.74% 0.91% fio [kernel.kallsyms] [k] pagecache_get_page ▒
+ 25.19% 0.66% fio [kernel.kallsyms] [k] __alloc_pages ▒
+ 24.44% 0.64% fio [kernel.kallsyms] [k] get_page_from_freelist
Upvotes: 0
Reputation: 5270
To get call graph of write syscall, you can:
sudo trace-cmd record -p function_graph -g vfs_write your-prog
your-prog can be simple write program like:
#include <stdio.h>
#include <sys/file.h>
int main()
{
int fd = open("a.txt", O_WRONLY | O_CREAT);
write(fd, "abcdefg", 6);
close(fd);
}
Then trace-cmd report | grep -v trace-cmd | less
Upvotes: 1
Reputation: 383816
strace
If you are going to just trace one process, why not just use strace
? It can even connect to running processes: How does strace connect to an already running process?
For ftrace, use echo none > /sys/kernel/debug/tracing/current_tracer
to trace just the system calls
As of Linux 4.15 at least, if you use function_graph
instead, then it shows a ton of functions in addition to the system call.
You could work around that with filtering, but it is simpler to just use nop
: How to trace just system call events with ftrace without showing any other functions in the Linux kernel?
Run with sudo
:
#!/bin/sh
set -eux
d=debug/tracing
mkdir -p debug
if ! mountpoint -q debug; then
mount -t debugfs nodev debug
fi
# Stop tracing.
echo 0 > "${d}/tracing_on"
# Clear previous traces.
echo > "${d}/trace"
# Find the tracer name.
cat "${d}/available_tracers"
# Disable tracing functions, show only system call events.
echo nop > "${d}/current_tracer"
# Find the event name with.
grep write "${d}/available_events"
# Enable tracing write.
# Both statements below seem to do the exact same thing,
# just with different interfaces.
# https://www.kernel.org/doc/html/v4.18/trace/events.html
echo sys_enter_write > "${d}/set_event"
# echo 1 > "${d}/events/syscalls/sys_enter_write/enable"
# Start tracing.
echo 1 > "${d}/tracing_on"
# Generate two write calls by two different processes.
rm -rf /tmp/a /tmp/b
printf a > /tmp/a
printf b > /tmp/b
# View the trace.
cat "${d}/trace"
# Stop tracing.
echo 0 > "${d}/tracing_on"
umount debug
and a sample output:
# tracer: nop
#
# _-----=> irqs-off
# / _----=> need-resched
# | / _---=> hardirq/softirq
# || / _--=> preempt-depth
# ||| / delay
# TASK-PID CPU# |||| TIMESTAMP FUNCTION
# | | | |||| | |
a.sh-18135 [004] .... 11152.454767: sys_write(fd: 2, buf: 5558769acc00, count: 2)
a.sh-18135 [004] .... 11152.454777: sys_write(fd: 2, buf: 555877f6a968, count: 2)
a.sh-18135 [004] .... 11152.454785: sys_write(fd: 2, buf: 555877f6a968, count: 4)
a.sh-18135 [004] .... 11152.454793: sys_write(fd: 2, buf: 555877f6a968, count: 7)
a.sh-18135 [004] .... 11152.454801: sys_write(fd: 2, buf: 555877f6a968, count: 7)
a.sh-18135 [004] .... 11152.454807: sys_write(fd: 2, buf: 7ffc5f3f2be7, count: 1)
gnome-terminal--3419 [005] .... 11152.454833: sys_write(fd: 4, buf: 7ffe4f3a61c0, count: 8)
gnome-terminal--3419 [005] .... 11152.454862: sys_write(fd: 4, buf: 7ffe4f3a61c0, count: 8)
gnome-terminal--3419 [005] .... 11152.454887: sys_write(fd: 4, buf: 7ffe4f3a61c0, count: 8)
gnome-terminal--3419 [005] .... 11152.454894: sys_write(fd: 4, buf: 7ffe4f3a61c0, count: 8)
gmain-3193 [002] .... 11152.456141: sys_write(fd: 4, buf: 7fbfe5a93c40, count: 8)
gmain-3193 [002] .... 11152.456168: sys_write(fd: 4, buf: 7fbfe5a92bd0, count: 8)
gmain-3193 [002] .... 11152.456172: sys_write(fd: 4, buf: 7fbfe5a93c40, count: 8)
a.sh-18135 [004] .... 11152.456534: sys_write(fd: 2, buf: 5558769acbd8, count: 2)
a.sh-18135 [004] .... 11152.456547: sys_write(fd: 2, buf: 555877f6a968, count: 6)
a.sh-18135 [004] .... 11152.456555: sys_write(fd: 2, buf: 555877f6a968, count: 2)
a.sh-18135 [004] .... 11152.456561: sys_write(fd: 2, buf: 7ffc5f3f2be7, count: 1)
a.sh-18135 [004] .... 11152.456578: sys_write(fd: 1, buf: 555877f6af00, count: 1)
gnome-terminal--3419 [005] .... 11152.456651: sys_write(fd: 4, buf: 7ffe4f3a61c0, count: 8)
gnome-terminal--3419 [005] .... 11152.456660: sys_write(fd: 4, buf: 7ffe4f3a61c0, count: 8)
a.sh-18135 [004] .... 11152.456674: sys_write(fd: 2, buf: 5558769acbd8, count: 2)
a.sh-18135 [004] .... 11152.456683: sys_write(fd: 2, buf: 555877f6a968, count: 6)
a.sh-18135 [004] .... 11152.456690: sys_write(fd: 2, buf: 555877f6a968, count: 2)
a.sh-18135 [004] .... 11152.456694: sys_write(fd: 2, buf: 7ffc5f3f2be7, count: 1)
a.sh-18135 [004] .... 11152.456703: sys_write(fd: 1, buf: 555877f6af00, count: 1)
a.sh-18135 [004] .... 11152.456738: sys_write(fd: 2, buf: 5558769acb88, count: 2)
a.sh-18135 [004] .... 11152.456745: sys_write(fd: 2, buf: 555877f6a968, count: 3)
a.sh-18135 [004] .... 11152.456750: sys_write(fd: 2, buf: 555877f6a968, count: 14)
a.sh-18135 [004] .... 11152.456756: sys_write(fd: 2, buf: 7ffc5f3f2be7, count: 1)
gnome-terminal--3419 [005] .... 11152.456816: sys_write(fd: 4, buf: 7ffe4f3a61c0, count: 8)
gnome-terminal--3419 [005] .... 11152.456845: sys_write(fd: 4, buf: 7ffe4f3a61c0, count: 8)
As you can see, write
is a very common system call, so it is difficult to isolate our two write calls. Things would be easier to see with a less common call, e.g. mkdir
: How do I trace a system call in Linux?
Tested on Ubuntu 18.04, Linux kernel 4.15.
Upvotes: 1
Reputation: 1137
The hard part is to know exactly what to trace so you can see only the results that you want, but the tracing itself is very easy:
Determine what event you want to trace
cat /sys/kernel/debug/tracing/available_events
Write the event that you have chosen into set_event
echo sys_enter_write > /sys/kernel/debug/tracing/set_event
Determine what kind of trace you want to trace
cat /sys/kernel/debug/tracing/available_tracers
Write the kind of trace that you want in current_tracer file
echo function_graph > /sys/kernel/debug/tracing/current_tracer
Enable tracing:
echo 1 > /sys/kernel/debug/tracing/tracing_on
Now you can run the "iperf -c... " as you wish, and disable tracing when it is done.
echo 0 > /sys/kernel/debug/tracing/tracing_on
To see the results:
vi /sys/kernel/debug/tracing/trace
Upvotes: 5