Reputation: 159
When I ran the perf list command, I found there are two kinds of events: Hardware event and Hardware cache event. What is the difference between the two?
What is the difference between cache-misses and LLC-misses? Does cache-misses include LLC-misses?
Do the perf tools reduce overall performance when I test a program?
Upvotes: 7
Views: 996
Reputation: 3234
Question 2: In the ARM kernel code for perf ("arch/arm/kernel/perf_event_v7.c"), cache-misses maps to ARMV7_PERFCTR_L1_DCACHE_REFILL, which is a first-level data cache miss. LLC therefore probably means last-level cache misses (L3, probably).
You can look in the architecture-specific kernel code to see what value ARMV7_PERFCTR_L1_DCACHE_REFILL has, and in the technical reference manual to find out exactly what that value means: http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0388i/BEHCCEAE.html
Question 3: I believe perf reads counters from hardware registers (at least for HW performance counters), so it won't really affect the performance of your code, since it doesn't put any hooks inside your code. However, some papers report a performance penalty of about 5% when perf is used.
Upvotes: 1
Reputation: 19050
According to the man page of the perf_event_open system call (used internally by the perf user-level utilities):
Moreover, I wonder whether this is linked to what are called non-architectural and architectural events in the Intel® 64 and IA-32 Architectures Software Developer's Manual, Volume 3B.
Regardless of the exact meaning of this categorization, cache-misses, as stated in a previous question and in the man page I mentioned above, represents the number of memory accesses that could not be served by any of the caches. In other words, it is the number of misses in the last-level cache. As a consequence, I guess it is the same as LLC-misses; unfortunately, I cannot confirm that on my laptop because LLC-misses is not supported there.
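To make the two categories more concrete, here is a minimal sketch of how each kind of event is described to perf_event_open. The numeric constants are the enum values from <linux/perf_event.h>. The helper name hw_cache_config is hypothetical, and treating the LLC miss event as the (LL, read, miss) combination is an assumption about how the perf tool names it, not something confirmed in this thread.

```python
# Enum values as defined in <linux/perf_event.h>.
PERF_TYPE_HARDWARE = 0               # generic "Hardware event" category
PERF_TYPE_HW_CACHE = 3               # "Hardware cache event" category
PERF_COUNT_HW_CACHE_MISSES = 3       # generic cache-misses event
PERF_COUNT_HW_CACHE_LL = 2           # last-level cache
PERF_COUNT_HW_CACHE_OP_READ = 0      # read (load) accesses
PERF_COUNT_HW_CACHE_RESULT_MISS = 1  # count misses rather than accesses


def hw_cache_config(cache_id, op_id, result_id):
    """Hypothetical helper: pack a hardware cache event into one config value.

    The perf_event_open man page gives the layout as
    cache_id | (op_id << 8) | (result_id << 16).
    """
    return cache_id | (op_id << 8) | (result_id << 16)


# Generic hardware event: the type selects the category, config names the event.
cache_misses = (PERF_TYPE_HARDWARE, PERF_COUNT_HW_CACHE_MISSES)

# Hardware cache event: config encodes which cache, which operation, which result.
# Assumption: this triple is what an LLC load-miss event corresponds to.
llc_load_misses = (
    PERF_TYPE_HW_CACHE,
    hw_cache_config(
        PERF_COUNT_HW_CACHE_LL,
        PERF_COUNT_HW_CACHE_OP_READ,
        PERF_COUNT_HW_CACHE_RESULT_MISS,
    ),
)

print("cache-misses   :", cache_misses)
print("LLC load misses:", llc_load_misses[0], hex(llc_load_misses[1]))
```

So a hardware event is a single predefined counter, while a hardware cache event is a (cache, operation, result) triple. Which physical counter either one maps to is decided by the architecture-specific kernel code.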
Regarding your last question, the overhead incurred by performance monitoring should be very low. The overhead is mainly due to reading the counter values, and with perf stat I guess this reading is done only once, at the end of the execution (assuming the counters don't overflow).
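Rather than guessing, you could measure the overhead on your own program. The following is a rough sketch that times a command once on its own and once under perf stat. The function names are hypothetical, it assumes perf is installed and on the PATH, and a single run is noisy, so in practice you would repeat it several times and compare averages.

```python
import subprocess
import time


def time_command(argv):
    """Run argv once and return its wall-clock duration in seconds."""
    start = time.perf_counter()
    subprocess.run(
        argv,
        check=True,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return time.perf_counter() - start


def overhead_percent(baseline_s, profiled_s):
    """Relative slowdown of the profiled run, in percent of the baseline."""
    return 100.0 * (profiled_s - baseline_s) / baseline_s


def perf_stat_overhead(argv, event="cache-misses"):
    """Compare one plain run with one run under `perf stat -e <event>`.

    Assumes the `perf` binary is available and that the event is supported
    on this machine.
    """
    baseline = time_command(argv)
    profiled = time_command(["perf", "stat", "-e", event, "--"] + list(argv))
    return overhead_percent(baseline, profiled)
```

For example, perf_stat_overhead(["./my_program"]) (where ./my_program is a placeholder for your own binary) returns the slowdown in percent. For short-running programs the startup cost of perf itself will dominate the result.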
Upvotes: 2