The flash

Reputation: 201

What do the perf record choices of LBR vs DWARF vs fp do?

When I use perf record on my code, I find three choices for the --call-graph option: lbr (last branch record), dwarf, and fp.

What is the difference between these?

Upvotes: 20

Views: 10561

Answers (1)

Zulan

Reputation: 22670

The option --call-graph refers to the collection of call graphs / call chains, i.e. the function stack for a sample.

The default, fp, uses frame pointers. This is very efficient but can be unreliable, particularly for optimized code, where the compiler may omit the frame pointer. By compiling with -fno-omit-frame-pointer, you can ensure that frame pointers are available for your own code. Nevertheless, the results for libraries may vary, since they may have been built without frame pointers.
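As a rough sketch of the frame-pointer workflow: the commands below build a program with frame pointers kept and then record call chains with fp. The names myapp.c and myapp are placeholders, and gcc is only an assumption here (clang accepts the same flag).

```shell
# Keep frame pointers even at -O2 so the stack can be walked (myapp.c is a placeholder).
gcc -O2 -g -fno-omit-frame-pointer -o myapp myapp.c

# Record call chains via frame pointers; plain -g normally selects the same method.
perf record --call-graph fp ./myapp

# Inspect the recorded call chains as text.
perf report --stdio
```

If the chains still look truncated inside library code, that is usually a hint that the library itself was built without frame pointers.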

With dwarf, perf collects and stores a part of the stack memory with each sample and unwinds it during post-processing. This can be very resource-consuming, and the stack depth that can be recovered is limited by the size of the stored chunk. The default stack memory chunk is 8 KiB, but this can be configured.
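To illustrate the configurable chunk size: the size in bytes can be appended to dwarf after a comma. The value 16384 and the binary name myapp are arbitrary placeholders, and the binary is assumed to carry debug or unwind information (for example, built with -g).

```shell
# Copy 16384 bytes of user stack per sample instead of the default 8192.
# Larger values recover deeper stacks but make perf.data grow faster.
perf record --call-graph dwarf,16384 ./myapp

# Unwinding happens here, during post-processing.
perf report --stdio
```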

lbr stands for last branch records. This is a hardware mechanism supported by Intel CPUs. It will probably offer the best performance at the cost of portability. lbr is also limited to userspace functions. (AMD Zen 4 CPUs also have an LBR implementation, available on Linux 6.1+.)
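A minimal invocation for this mode might look as follows. Again, myapp is a placeholder, and on CPUs or virtual machines that do not expose LBR you should expect perf to refuse the request rather than record anything.

```shell
# Record call chains from the hardware last branch records (needs CPU support).
perf record --call-graph lbr ./myapp

# Inspect the result the same way as for the other modes.
perf report --stdio
```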

Upvotes: 20
