user3145350
user3145350

Reputation: 93

How to trace the write system call in the Linux kernel?

I am trying to do this: I am sending packet through iperf-an open source tool from one machine to another and I want to trace the write system or send call. Please help me to do that,if someone can guide through Ftrace framework to trace system call that would be great otherwise by any other tracing tool.

Upvotes: 3

Views: 5945

Answers (4)

Li Chen
Li Chen

Reputation: 5270

Apart from ftrace, you can also use perf like:

sudo perf record --call-graph fp ./fio --filename=aa --size=4G --direct=0 --rw=write --bs=1M  --iodepth=512 --numjobs=1 --name=job_name

Then:

sudo perf report --call-graph

and expand vfs_write:

  Children      Self  Command  Shared Object       Symbol                                                                                                                                                                                                                                                                                                 
+   80.17%     0.00%  fio      [kernel.kallsyms]   [k] entry_SYSCALL_64_after_hwframe                                                                                                                                                                                                                                                                    ◆
+   80.15%     0.02%  fio      [kernel.kallsyms]   [k] do_syscall_64                                                                                                                                                                                                                                                                                     ▒
+   78.57%     0.07%  fio      libpthread-2.31.so  [.] __libc_pwrite64                                                                                                                                                                                                                                                                                   ▒
-   78.49%     0.04%  fio      [kernel.kallsyms]   [k] vfs_write                                                                                                                                                                                                                                                                                         ▒
   - 78.46% vfs_write                                                                                                                                                                                                                                                                                                                                    ▒
      - 78.40% new_sync_write                                                                                                                                                                                                                                                                                                                            ▒
         - 78.37% ext4_file_write_iter                                                                                                                                                                                                                                                                                                                   ▒
            - 78.33% ext4_buffered_write_iter                                                                                                                                                                                                                                                                                                            ▒
               - 77.73% generic_perform_write                                                                                                                                                                                                                                                                                                            ▒
                  - 50.15% ext4_da_write_begin                                                                                                                                                                                                                                                                                                           ▒
                     - 38.72% grab_cache_page_write_begin                                                                                                                                                                                                                                                                                                ▒
                        - 38.61% pagecache_get_page                                                                                                                                                                                                                                                                                                      ▒
                           - 23.52% __page_cache_alloc                                                                                                                                                                                                                                                                                                   ▒
                              - 23.16% alloc_pages                                                                                                                                                                                                                                                                                                       ▒
                                 - 22.63% __alloc_pages                                                                                                                                                                                                                                                                                                  ▒
                                    - 22.00% get_page_from_freelist                                                                                                                                                                                                                                                                                      ▒
                                       - 18.16% prep_new_page                                                                                                                                                                                                                                                                                            ▒
                                            17.84% clear_page_rep                                                                                                                                                                                                                                                                                        ▒
                                         1.80% rmqueue                                                                                                                                                                                                                                                                                                   ▒
                                         1.28% node_dirty_ok                                                                                                                                                                                                                                                                                             ▒
                           - 12.84% add_to_page_cache_lru                                                                                                                                                                                                                                                                                                ▒
                              - 9.77% __add_to_page_cache_locked                                                                                                                                                                                                                                                                                         ▒
                                 - 1.89% __mem_cgroup_charge                                                                                                                                                                                                                                                                                             ▒
                                      1.01% charge_memcg                                                                                                                                                                                                                                                                                                 ▒
                                      0.77% get_mem_cgroup_from_mm                                                                                                                                                                                                                                                                                       ▒
                                 - 1.09% xa_get_order                                                                                                                                                                                                                                                                                                    ▒
                                    - 0.89% xas_load                                                                                                                                                                                                                                                                                                     ▒
                                         0.53% xas_descend                                                                                                                                                                                                                                                                                               ▒
                              - 2.69% lru_cache_add                                                                                                                                                                                                                                                                                                      ▒
                                 - 2.37% __pagevec_lru_add                                                                                                                                                                                                                                                                                               ▒
                                      2.01% __lock_text_start                                                                                                                                                                                                                                                                                            ▒
                           - 1.21% xas_load                                                                                                                                                                                                                                                                                                              ▒
                                0.66% xas_descend                                                                                                                                                                                                                                                                                                        ▒
                     - 10.38% ext4_block_write_begin                                                                                                                                                                                                                                                                                                     ▒
                        - 7.01% create_empty_buffers                                                                                                                                                                                                                                                                                                     ▒
                           - 5.98% alloc_page_buffers                                                                                                                                                                                                                                                                                                    ▒
                              - 5.59% alloc_buffer_head                                                                                                                                                                                                                                                                                                  ▒
                                 - 4.24% kmem_cache_alloc                                                                                                                                                                                                                                                                                                ▒
                                    - 1.03% ___slab_alloc                                                                                                                                                                                                                                                                                                ▒
                                       - new_slab                                                                                                                                                                                                                                                                                                        ▒
                                          - 0.87% allocate_slab                                                                                                                                                                                                                                                                                          ▒
                                             - 0.57% alloc_pages                                                                                                                                                                                                                                                                                         ▒
                                                - 0.55% __alloc_pages                                                                                                                                                                                                                                                                                    ▒
                                                     get_page_from_freelist                                                                                                                                                                                                                                                                              ▒
                                      0.84% get_obj_cgroup_from_current                                                                                                                                                                                                                                                                                  ▒
                                   0.53% memset                                                                                                                                                                                                                                                                                                          ▒
                        - 2.42% ext4_da_get_block_prep                                                                                                                                                                                                                                                                                                   ▒
                           - 1.26% ext4_da_map_blocks.constprop.0                                                                                                                                                                                                                                                                                        ▒
                                0.53% ext4_es_lookup_extent                                                                                                                                                                                                                                                                                              ▒
                  - 16.27% copy_page_from_iter_atomic                                                                                                                                                                                                                                                                                                    ▒
                       copy_user_generic_string                                                                                                                                                                                                                                                                                                          ▒
                  - 8.74% ext4_da_write_end                                                                                                                                                                                                                                                                                                              ▒
                     - 8.23% generic_write_end                                                                                                                                                                                                                                                                                                           ▒
                        - 7.83% block_write_end                                                                                                                                                                                                                                                                                                          ▒
                           - __block_commit_write.isra.0                                                                                                                                                                                                                                                                                                 ▒
                              - 6.89% mark_buffer_dirty                                                                                                                                                                                                                                                                                                  ▒
                                 - 5.66% __set_page_dirty                                                                                                                                                                                                                                                                                                ▒
                                      5.54% __lock_text_start                                                                                                                                                                                                                                                                                            ▒
                    1.35% __get_user_nocheck_1                                                                                                                                                                                                                                                                                                           ▒
+   78.48%     0.00%  fio      [kernel.kallsyms]   [k] __x64_sys_pwrite64                                                                                                                                                                                                                                                                                ▒
+   78.48%     0.00%  fio      [kernel.kallsyms]   [k] ksys_pwrite64                                                                                                                                                                                                                                                                                     ▒
+   78.40%     0.02%  fio      [kernel.kallsyms]   [k] new_sync_write                                                                                                                                                                                                                                                                                    ▒
+   78.37%     0.02%  fio      [kernel.kallsyms]   [k] ext4_file_write_iter                                                                                                                                                                                                                                                                              ▒
+   78.33%     0.00%  fio      [kernel.kallsyms]   [k] ext4_buffered_write_iter                                                                                                                                                                                                                                                                          ▒
+   77.73%     0.27%  fio      [kernel.kallsyms]   [k] generic_perform_write                                                                                                                                                                                                                                                                             ▒
+   50.28%     0.45%  fio      [kernel.kallsyms]   [k] ext4_da_write_begin                                                                                                                                                                                                                                                                               ▒
+   38.76%     0.07%  fio      [kernel.kallsyms]   [k] grab_cache_page_write_begin                                                                                                                                                                                                                                                                       ▒
+   38.74%     0.91%  fio      [kernel.kallsyms]   [k] pagecache_get_page                                                                                                                                                                                                                                                                                ▒
+   25.19%     0.66%  fio      [kernel.kallsyms]   [k] __alloc_pages                                                                                                                                                                                                                                                                                     ▒
+   24.44%     0.64%  fio      [kernel.kallsyms]   [k] get_page_from_freelist                             

Upvotes: 0

Li Chen
Li Chen

Reputation: 5270

To get call graph of write syscall, you can:

sudo trace-cmd record -p function_graph -g vfs_write your-prog

your-prog can be simple write program like:

#include <stdio.h>
#include <sys/file.h>
int main()
{
        int fd = open("a.txt", O_WRONLY | O_CREAT);
        write(fd, "abcdefg", 6);
        close(fd);
}

Then trace-cmd report | grep -v trace-cmd | less

Upvotes: 1

strace

If you are going to just trace one process, why not just use strace? It can even connect to running processes: How does strace connect to an already running process?

For ftrace, use echo none > /sys/kernel/debug/tracing/current_tracer to trace just the system calls

As of Linux 4.15 at least, if you use function_graph instead, then it shows a ton of functions in addition to the system call.

You could work around that with filtering, but it is simpler to just use nop: How to trace just system call events with ftrace without showing any other functions in the Linux kernel?

Run with sudo:

#!/bin/sh
set -eux

d=debug/tracing

mkdir -p debug
if ! mountpoint -q debug; then
  mount -t debugfs nodev debug
fi

# Stop tracing.
echo 0 > "${d}/tracing_on"

# Clear previous traces.
echo > "${d}/trace"

# Find the tracer name.
cat "${d}/available_tracers"

# Disable tracing functions, show only system call events.
echo nop > "${d}/current_tracer"

# Find the event name with.
grep write "${d}/available_events"

# Enable tracing write.
# Both statements below seem to do the exact same thing,
# just with different interfaces.
# https://www.kernel.org/doc/html/v4.18/trace/events.html
echo sys_enter_write > "${d}/set_event"
# echo 1 > "${d}/events/syscalls/sys_enter_write/enable"

# Start tracing.
echo 1 > "${d}/tracing_on"

# Generate two write calls by two different processes.
rm -rf /tmp/a /tmp/b
printf a > /tmp/a
printf b > /tmp/b

# View the trace.
cat "${d}/trace"

# Stop tracing.
echo 0 > "${d}/tracing_on"

umount debug

and a sample output:

# tracer: nop
#
#                              _-----=> irqs-off
#                             / _----=> need-resched
#                            | / _---=> hardirq/softirq
#                            || / _--=> preempt-depth
#                            ||| /     delay
#           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
#              | |       |   ||||       |         |
            a.sh-18135 [004] .... 11152.454767: sys_write(fd: 2, buf: 5558769acc00, count: 2)
            a.sh-18135 [004] .... 11152.454777: sys_write(fd: 2, buf: 555877f6a968, count: 2)
            a.sh-18135 [004] .... 11152.454785: sys_write(fd: 2, buf: 555877f6a968, count: 4)
            a.sh-18135 [004] .... 11152.454793: sys_write(fd: 2, buf: 555877f6a968, count: 7)
            a.sh-18135 [004] .... 11152.454801: sys_write(fd: 2, buf: 555877f6a968, count: 7)
            a.sh-18135 [004] .... 11152.454807: sys_write(fd: 2, buf: 7ffc5f3f2be7, count: 1)
 gnome-terminal--3419  [005] .... 11152.454833: sys_write(fd: 4, buf: 7ffe4f3a61c0, count: 8)
 gnome-terminal--3419  [005] .... 11152.454862: sys_write(fd: 4, buf: 7ffe4f3a61c0, count: 8)
 gnome-terminal--3419  [005] .... 11152.454887: sys_write(fd: 4, buf: 7ffe4f3a61c0, count: 8)
 gnome-terminal--3419  [005] .... 11152.454894: sys_write(fd: 4, buf: 7ffe4f3a61c0, count: 8)
           gmain-3193  [002] .... 11152.456141: sys_write(fd: 4, buf: 7fbfe5a93c40, count: 8)
           gmain-3193  [002] .... 11152.456168: sys_write(fd: 4, buf: 7fbfe5a92bd0, count: 8)
           gmain-3193  [002] .... 11152.456172: sys_write(fd: 4, buf: 7fbfe5a93c40, count: 8)
            a.sh-18135 [004] .... 11152.456534: sys_write(fd: 2, buf: 5558769acbd8, count: 2)
            a.sh-18135 [004] .... 11152.456547: sys_write(fd: 2, buf: 555877f6a968, count: 6)
            a.sh-18135 [004] .... 11152.456555: sys_write(fd: 2, buf: 555877f6a968, count: 2)
            a.sh-18135 [004] .... 11152.456561: sys_write(fd: 2, buf: 7ffc5f3f2be7, count: 1)
            a.sh-18135 [004] .... 11152.456578: sys_write(fd: 1, buf: 555877f6af00, count: 1)
 gnome-terminal--3419  [005] .... 11152.456651: sys_write(fd: 4, buf: 7ffe4f3a61c0, count: 8)
 gnome-terminal--3419  [005] .... 11152.456660: sys_write(fd: 4, buf: 7ffe4f3a61c0, count: 8)
            a.sh-18135 [004] .... 11152.456674: sys_write(fd: 2, buf: 5558769acbd8, count: 2)
            a.sh-18135 [004] .... 11152.456683: sys_write(fd: 2, buf: 555877f6a968, count: 6)
            a.sh-18135 [004] .... 11152.456690: sys_write(fd: 2, buf: 555877f6a968, count: 2)
            a.sh-18135 [004] .... 11152.456694: sys_write(fd: 2, buf: 7ffc5f3f2be7, count: 1)
            a.sh-18135 [004] .... 11152.456703: sys_write(fd: 1, buf: 555877f6af00, count: 1)
            a.sh-18135 [004] .... 11152.456738: sys_write(fd: 2, buf: 5558769acb88, count: 2)
            a.sh-18135 [004] .... 11152.456745: sys_write(fd: 2, buf: 555877f6a968, count: 3)
            a.sh-18135 [004] .... 11152.456750: sys_write(fd: 2, buf: 555877f6a968, count: 14)
            a.sh-18135 [004] .... 11152.456756: sys_write(fd: 2, buf: 7ffc5f3f2be7, count: 1)
 gnome-terminal--3419  [005] .... 11152.456816: sys_write(fd: 4, buf: 7ffe4f3a61c0, count: 8)
 gnome-terminal--3419  [005] .... 11152.456845: sys_write(fd: 4, buf: 7ffe4f3a61c0, count: 8)

As you can see, write is a very common system call, so it is difficult to isolate our two write calls. Things would be easier to see with a less common call, e.g. mkdir: How do I trace a system call in Linux?

Tested on Ubuntu 18.04, Linux kernel 4.15.

Upvotes: 1

Edgar
Edgar

Reputation: 1137

The hard part is to know exactly what to trace so you can see only the results that you want, but the tracing itself is very easy:

  • First, your kernel must be configured with CONFIG_FTRACE=y
  • Determine what event you want to trace

    cat /sys/kernel/debug/tracing/available_events

  • Write the event that you have chosen into set_event

    echo sys_enter_write > /sys/kernel/debug/tracing/set_event

  • Determine what kind of trace you want to trace

    cat /sys/kernel/debug/tracing/available_tracers

  • Write the kind of trace that you want in current_tracer file

    echo function_graph > /sys/kernel/debug/tracing/current_tracer

  • Enable tracing:

    echo 1 > /sys/kernel/debug/tracing/tracing_on

  • Now you can run the "iperf -c... " as you wish, and disable tracing when it is done.

    echo 0 > /sys/kernel/debug/tracing/tracing_on

To see the results:

vi /sys/kernel/debug/tracing/trace

Upvotes: 5

Related Questions