Reputation: 299
I'm using perf to get an idea of the overhead each function of my program imposes on the total execution time. For that, I use the cpu-cycles event:
perf record -e cpu-cycles -c 10000 <binary-with-arguments>
When I look at the output, I see a percentage associated with each function. What doesn't make sense to me is a case like this: function A is called only from function B and nowhere else, yet the overhead percentage I get for A is higher than the one for B. If B calls A, B's overhead should include A's. Or am I missing something here?
Upvotes: 5
Views: 3176
Reputation: 1114
perf uses the model-specific registers (MSRs) of your CPU to measure events such as cycles or branch misses. A dedicated part of the CPU, the PMU (Performance Monitoring Unit), counts all kinds of events.
So if you measure only a few events, there is practically no overhead, because the CPU's PMU counts independently of the actual computation.
If you request more events than the PMU has counter registers, perf cycles through the events, so each one is counted only part of the time. perf annotates such events with [XX %], the share of the run during which that event was actually being counted.
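To make the multiplexing point concrete, here is a minimal sketch of how a count from a time-shared counter can be extrapolated to the whole run. It assumes the usual scaling by the ratio of enabled time to running time; the function names and the numbers are made up for illustration and are not part of perf itself.

```python
def scale_multiplexed_count(raw_count, time_enabled, time_running):
    """Extrapolate a raw count to the full period the event was enabled.

    time_enabled: how long the event was requested (hypothetical units: ns)
    time_running: how long it actually occupied a hardware counter
    """
    if time_running == 0:
        return None  # the event never got a counter, so nothing can be estimated
    return raw_count * time_enabled / time_running


def running_percentage(time_enabled, time_running):
    """Share of the enabled time during which the event was really counted."""
    return 100.0 * time_running / time_enabled


# Illustrative values only: the event held a counter for a quarter of the run.
raw = 1_000_000
enabled_ns = 4_000_000
running_ns = 1_000_000

estimate = scale_multiplexed_count(raw, enabled_ns, running_ns)
share = running_percentage(enabled_ns, running_ns)
print(f"estimated count: {estimate:.0f}, counted during {share:.2f}% of the run")
```

The lower the percentage, the more the reported value is an extrapolation rather than a direct measurement.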
Upvotes: 4
Reputation: 19050
The perf command you are using only samples your program, without recording any call-stack information. With perf report you therefore get the number of samples that fell into each function, independently of their calling relations: a sample taken while A is executing is attributed to A alone, not to its caller B.
You can use the --call-graph option of perf record to capture call stacks and get a tree when using perf report:
perf record -e cpu-cycles --call-graph dwarf -c 10000 <binary-with-arguments>
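The difference between the two views can be sketched with a toy model. This is not how perf is implemented; it is a small illustration with invented sample data, where each sample is a call stack from outermost caller to the function that was executing. Without call stacks only the last frame is known (the flat, "self" view); with them, every function on the stack can be credited (the inclusive view).

```python
from collections import Counter


def flat_and_inclusive(stacks):
    """Count samples per function in two ways.

    self_counts: only the function executing when the sample was taken.
    inclusive_counts: every function present on the sampled call stack.
    """
    self_counts = Counter()
    inclusive_counts = Counter()
    for stack in stacks:
        self_counts[stack[-1]] += 1
        # set() so a recursive function is credited once per sample
        for func in set(stack):
            inclusive_counts[func] += 1
    return self_counts, inclusive_counts


# Hypothetical samples: A is called only from B, and does most of the work.
samples = (
    [("main", "B", "A")] * 7
    + [("main", "B")] * 2
    + [("main",)] * 1
)

self_counts, inclusive_counts = flat_and_inclusive(samples)
total = len(samples)
for func in ("main", "B", "A"):
    self_pct = 100 * self_counts[func] / total
    incl_pct = 100 * inclusive_counts[func] / total
    print(f"{func}: self {self_pct:.0f}%, inclusive {incl_pct:.0f}%")
```

In this made-up data A has a higher self percentage than B (7 of 10 samples versus 2 of 10), which matches what the question observed, while B's inclusive count (9 of 10) covers A's, which is what the question expected.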
Upvotes: 5