5-to-9

Reputation: 649

"perf" - count instructions per method

I would like to get the dynamic instruction count for each function call in my code, so that I can view the result as something like:

name of function | instructions 
     foo()       |     3533 
     bar()       |     1234

So I have the following subquestions:

  1. Is this possible using perf?
  2. If yes: which perf flags should I use to get (at least) that information?
  3. If no: what other program could I use to do so?

Upvotes: 4

Views: 4091

Answers (2)

BeeOnRope

Reputation: 64915

Are you trying to get the static instruction count, i.e., the number of instructions each function has been compiled into in the final binary?

If that's the case, this is a static property of the binary, so you don't need perf (which works at runtime) to determine this - you can just disassemble the binary with objdump -d a.out and count the number of instructions. If you want to automate it, use the scripting language of your choice or awk or whatever (probably looking for the next blank line).

As an example, you can take something like:

int foo(int a, int b) {
    return a << (10 + b);
}

And the objdump output will look something like this (the exact contents depend on the compiler and flags):

foo(int, int): # @foo(int, int)
  lea ecx, [rsi + 10]
  shl edi, cl
  mov eax, edi
  ret

So a total of 4 instructions, including the ret.
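If you want to automate the counting, a minimal Python sketch could look like the following. It assumes the usual GNU objdump -d text layout (a header line such as `0000000000001129 <foo>:` followed by tab-separated `address:`, raw bytes, and mnemonic fields); other disassemblers or flags may need a different parser. The function name count_static_instructions is just illustrative.

```python
import re

# Matches a function header such as "0000000000001129 <foo>:"
# (assumed GNU objdump -d layout).
HEADER = re.compile(r"^[0-9a-fA-F]+ <(.+)>:\s*$")


def count_static_instructions(disassembly: str) -> dict:
    """Count instructions per function in the text produced by objdump -d."""
    counts = {}
    current = None
    for line in disassembly.splitlines():
        match = HEADER.match(line)
        if match:
            current = match.group(1)
            counts[current] = 0
            continue
        if current is None:
            continue
        # Instruction lines look like "  1129:\t8d 4e 0a \tlea ...".
        # Long instructions wrap onto a continuation line that carries
        # only bytes and no mnemonic field, so those are skipped.
        fields = line.split("\t")
        if (len(fields) >= 3
                and fields[0].strip().endswith(":")
                and fields[2].strip()):
            counts[current] += 1
    return counts
```

You would feed it the captured text of objdump -d a.out, for example read from a file or from standard input, and print the resulting dictionary.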

Maybe, however, you are talking about the dynamic instruction count - i.e., the total number of instructions executed inside each method in a particular run of your application? In this case, you can get an approximate answer pretty quickly with perf record -e instructions followed by perf report -n --stdio, which should list functions along with their sample count. You can scale the sample count up to an instruction count by multiplying it by the ratio of the "Event count" shown at the top of the report to the total number of samples.

A typical report might look like:

#
# Total Lost Samples: 0
#
# Samples: 51K of event 'instructions:p'
# Event count (approx.): 27502612549
#
# Overhead       Samples  Command      Shared Object        Symbol                                                                                           
# ........  ............  ...........  ...................  .................................................................................................
#
    22.01%          4824  uarch-bench  uarch-bench          [.] add_calibration
     1.92%          2480  uarch-bench  uarch-bench          [.] prefetcht2_bench2048_inner.top
     1.92%          2477  uarch-bench  uarch-bench          [.] prefetcht1_bench2048_inner.top
     1.91%           222  uarch-bench  uarch-bench          [.] prefetcht0_bench16_inner.top
     1.91%          2021  uarch-bench  uarch-bench          [.] load_loop512_inner.top
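The scaling step can be sketched in a few lines of Python. This is only an illustration: it assumes that every sample stands for roughly the same number of instructions and that the dictionary you pass in covers all samples in the report. The function name and the numbers in the usage note are hypothetical, not taken from the report above.

```python
def estimate_instructions(samples_by_function: dict, event_count: int) -> dict:
    """Scale per-function sample counts to approximate instruction counts.

    samples_by_function: function name -> sample count (the "Samples" column)
    event_count: the "Event count (approx.)" value from the report header
    """
    total_samples = sum(samples_by_function.values())
    if total_samples == 0:
        return {name: 0 for name in samples_by_function}
    # Approximate number of instructions represented by one sample.
    instructions_per_sample = event_count / total_samples
    return {
        name: round(samples * instructions_per_sample)
        for name, samples in samples_by_function.items()
    }
```

For instance, with 750 samples in foo and 250 in bar out of an event count of 1,000,000, this attributes about 750,000 instructions to foo and 250,000 to bar.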

Under reasonable assumptions, you can expect these statistical results to be fairly close to the true results. However, if you want an exact count, solutions are available, such as using Intel's Processor Trace, which can reconstruct the entire execution history of a process. Such approaches are also mentioned in Peter's answer.

Upvotes: 7

Peter Cordes

Reputation: 364180

If you want an exact count of dynamic instructions, per block / per function, something like Intel PIN (dynamic code instrumentation) might work better, if you're on x86. (But I haven't used it so I can't show you how)

Or maybe using Intel PT to trace branches (if you're on an Intel CPU), combined with static instruction counts for basic blocks, you could get dynamic instruction counts cheaply.
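To make the idea concrete, here is a rough Python sketch of the bookkeeping only. It assumes you have already decoded the branch trace into a sequence of basic-block identifiers and built a table of static instruction counts per block; both inputs, and all the names used, are hypothetical, and the decoding itself is not shown.

```python
from collections import Counter


def dynamic_counts(block_trace, block_info) -> dict:
    """Combine a basic-block trace with static per-block instruction counts.

    block_trace: iterable of basic-block ids in execution order
    block_info:  block id -> (function name, static instruction count)
    """
    executions = Counter(block_trace)
    totals = Counter()
    for block, times in executions.items():
        function, static_count = block_info[block]
        # Entering a basic block executes all of its instructions,
        # so each execution contributes the block's full static count.
        totals[function] += times * static_count
    return dict(totals)
```

With a block table of foo.entry (3 instructions), foo.loop (5) and bar.entry (4), a trace that enters foo.entry once, foo.loop three times and bar.entry once yields 18 instructions for foo and 4 for bar.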

There's probably a brute-force way with gdb single-stepping and printing the function name every time. (Or printing the instruction pointer after every step).

Upvotes: 1
