barfatchen

Reputation: 1698

perf report about system calls

I have the following output from perf report (about malloc) for Processes A and B:

recorded with: perf record -e cycles:u

Process A:

0.00%       1833448  Test-Recv  libc-2.17.so           [.] malloc              
0.00%       1833385  Test-Recv  [kernel.kallsyms]      [k] system_call         
0.00%        916588  Test-Recv  libc-2.17.so           [.] _int_malloc

and also the following for Process B:

24.90%   10855848444  test.exe  libc-2.17.so   [.] _int_malloc
15.78%    6881565672  test.exe  libc-2.17.so   [.] _int_free
 7.48%    3261672221  test.exe  libc-2.17.so   [.] malloc
 4.66%    2030332370  test.exe  libc-2.17.so   [.] systrim.isra.2
 2.43%    1061251259  test.exe  libc-2.17.so   [.] free
 2.12%     925588492  test.exe  test.exe       [.] main

Both of them call malloc in their source code.

May I assume that in Process A's case malloc did result in system calls, but that in Process B's case no system call happened, since there is no [k] system_call entry in Process B's perf report at all?

Upvotes: 3

Views: 760

Answers (2)

Arnaldo Melo

Reputation: 81

You don't get all the functions called by a program when using sampling; you get only some of them, the ones where the sampled event fires the most. For "cycles:u" you'll get the "hottest" functions in user space (no kernel functions).
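If the goal is just to know whether a given syscall happened at all, counting events instead of sampling them also works; a sketch, assuming the syscalls:* tracepoints are available on the system and permissions allow using them, with test.exe standing in for the workload:

perf stat -e syscalls:sys_enter_brk,syscalls:sys_enter_mmap,syscalls:sys_enter_munmap ./test.exe

That prints how many times each of those memory-related syscalls was entered while the workload ran.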

Consider using tracing instead of sampling, something like 'perf trace workload'. Consider using backtraces with it; for instance, looking at the backtraces for the 'brk' syscall that 'ls' does, we get:



# perf trace -e brk --call-graph dwarf ls
   0.933 (0.009 ms): ls brk(brk: 0x5577c4683000) = 0x5577c4683000
                     __brk (/usr/lib64/libc-2.26.so)
                     __GI___sbrk (inlined)
                     __GI___default_morecore (inlined)
                     sysmalloc (/usr/lib64/libc-2.26.so)
                     _int_malloc (/usr/lib64/libc-2.26.so)
                     tcache_init.part.5 (/usr/lib64/libc-2.26.so)
                     __GI___libc_malloc (inlined)
                     __GI___strdup (inlined)
                     [0xffff80aa65b9ae49] (/usr/lib64/libselinux.so.1)
                     [0xffff80aa65b9af0e] (/usr/lib64/libselinux.so.1)
                     call_init.part.0 (/usr/lib64/ld-2.26.so)
                     _dl_init (/usr/lib64/ld-2.26.so)
                     _dl_start_user (/usr/lib64/ld-2.26.so)

That shows that the syscall was called in this case, in response to a strdup() that called malloc() that ended up asking the kernel for more memory via the 'brk' syscall.

Play with 'perf trace' some more and you'll discover statistics like the ones provided by 'strace', for instance, how many times a program called brk and other syscalls.
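A per-syscall summary much like the one 'strace -c' gives comes out of the -s/--summary option; a minimal sketch (test.exe again standing in for the workload, and the exact layout depending on the perf version):

perf trace -s ./test.exe

That prints, per thread, a table with the number of calls, errors and timings for each syscall the workload made.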

Upvotes: 3

Peter Cordes

Reputation: 364220

Yeah, seems reasonable. Probably process B got some memory from the kernel once, then was able to satisfy all its allocations from the free list; i.e. the free list never got big enough (or was too fragmented) for glibc's malloc implementation to decide to give any of the pages back to the kernel.

It all comes down to allocation / deallocation patterns, and the sizes of the mappings. For large malloc requests, glibc uses mmap(MAP_ANONYMOUS) directly, so it can munmap that memory on free.
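To see which path a given allocation pattern takes, tracing just the memory-management syscalls is usually enough; a sketch, assuming a 'perf trace' that accepts a syscall list for -e, and using glibc's MALLOC_MMAP_THRESHOLD_ environment variable (value in bytes, glibc-specific) to lower the mmap threshold for comparison:

# watch the syscalls glibc malloc/free mainly use to get/return memory
perf trace -e brk,mmap,munmap ./test.exe

# force allocations of 64 KiB and up through mmap instead of the brk heap
MALLOC_MMAP_THRESHOLD_=65536 perf trace -e brk,mmap,munmap ./test.exe

With the lowered threshold you'd expect to see mmap/munmap pairs for allocations that previously stayed on the heap and never showed up as syscalls at all.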

Upvotes: 1
