Reputation: 475
I want to study the effects of L2 cache misses on CPU power consumption. To measure this, I have to create benchmarks that gradually increase the working set size such that core activity (micro-operations executed per cycle) and L2 activity (L2 requests per cycle) remain constant, but the ratio of L2 misses to L2 requests increases.
To measure the cache hits/misses I tried Valgrind, but its Cachegrind tool only models a two-level cache and my laptop has three levels.
Is there a tool that can measure all cache levels in a C program?
Upvotes: 2
Views: 288
Reputation: 1217
Modern CPUs have a PMU (performance monitoring unit) which can be used to count L1/2/3/4 cache hits, misses and requests, among many other events. There are a couple of good libraries out there which expose the PMU.
I'm familiar with PAPI, perf and Intel's PMU. I prefer Intel's implementation because it implements performance counters on QPI and other "uncore" stuff. I think most people use PAPI because it is frequently updated for new hardware and has both high-level and low-level interfaces.
Implementing this stuff isn't trivial, but there is plenty of information out there about it. Typically you just mark the regions of code you want to profile and then specify which counters you want to use. Note that you will only have a limited number of hardware counters at your disposal, depending on the PMU in your chip and on how many are already being used by your operating system.
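As a rough sketch of that "mark a region, pick a counter" pattern, here is what it could look like with the Linux `perf_event_open` system call rather than PAPI. The helper names (`hw_cache_config`, `open_cache_counter`, `measure_region`) and the `walk_buffer` workload are hypothetical, made up for illustration. Note the assumption baked in: the generic cache events only cover L1 and the last-level cache, so L2-specific events would need `PERF_TYPE_RAW` with model-specific event codes from your CPU vendor's manual.

```c
#define _GNU_SOURCE
#include <linux/perf_event.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Pack a generic cache event: cache id | (operation << 8) | (result << 16). */
static uint64_t hw_cache_config(uint64_t cache, uint64_t op, uint64_t result)
{
    return cache | (op << 8) | (result << 16);
}

/* Open a user-space-only counter for the calling thread on any CPU.
   Returns a file descriptor, or -1 if the kernel or PMU rejects the event. */
static int open_cache_counter(uint64_t config)
{
    struct perf_event_attr attr;
    memset(&attr, 0, sizeof attr);
    attr.size = sizeof attr;
    attr.type = PERF_TYPE_HW_CACHE;
    attr.config = config;
    attr.disabled = 1;        /* start stopped; enabled around the region */
    attr.exclude_kernel = 1;
    attr.exclude_hv = 1;
    return (int)syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0UL);
}

/* Hypothetical workload: touch one byte per 64-byte stride of the buffer
   (64 bytes is an assumed cache line size). */
struct walk_args { volatile unsigned char *buf; size_t bytes; };

static void walk_buffer(void *p)
{
    struct walk_args *a = p;
    for (size_t i = 0; i < a->bytes; i += 64)
        a->buf[i]++;
}

/* Count events while region(arg) runs. Returns the count, or -1 on failure. */
static long long measure_region(int fd, void (*region)(void *), void *arg)
{
    uint64_t count = 0;
    if (fd < 0)
        return -1;
    if (ioctl(fd, PERF_EVENT_IOC_RESET, 0) == -1 ||
        ioctl(fd, PERF_EVENT_IOC_ENABLE, 0) == -1)
        return -1;
    region(arg);
    if (ioctl(fd, PERF_EVENT_IOC_DISABLE, 0) == -1)
        return -1;
    if (read(fd, &count, sizeof count) != (ssize_t)sizeof count)
        return -1;
    return (long long)count;
}
```

To use it, you would open one counter with `hw_cache_config(PERF_COUNT_HW_CACHE_LL, PERF_COUNT_HW_CACHE_OP_READ, PERF_COUNT_HW_CACHE_RESULT_MISS)` and another with `PERF_COUNT_HW_CACHE_RESULT_ACCESS`, then call `measure_region` with `walk_buffer` on buffers of increasing size. Opening the counter can fail if your kernel's `perf_event_paranoid` setting forbids unprivileged access or if the PMU does not support the event, so check for -1.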
Also, I don't believe Valgrind's cache analysis reads the PMU at all; it simulates the cache in software instead.
Upvotes: 2