Arek' Fu

Reputation: 857

cachegrind counts do not reflect real performance

Two versions of the same algorithm yield different total instruction fetch counts and cycle estimates under valgrind/cachegrind. The difference is about 25%. Wall-clock timing of the process, however, is very similar (it is actually shorter for the cachegrind-slow version).

Is this behaviour expected? How can I learn more about why version 1 is slower?

Upvotes: 0

Views: 220

Answers (1)

Arek' Fu

Reputation: 857

I discovered that the inconsistency is due to the different compiler options used for the cachegrind runs and for the timing runs. In particular, I had disabled function inlining for the cachegrind runs (so that I could get meaningful per-function counts).
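
To illustrate why inlining can shift instruction counts this much, here is a minimal sketch. It is not the code from the question: the function names are hypothetical, and `__attribute__((noinline))` is a GCC/Clang extension used here only to mimic the effect of building with inlining disabled. Both variants compute the same sum, but the non-inlined one executes a call and return for every element, which cachegrind counts as extra instruction fetches.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: noinline (GCC/Clang attribute) forces a real
 * call/return for every element, as when inlining is disabled. */
__attribute__((noinline))
static long square_noinline(long x)
{
    return x * x;
}

/* Same helper, but the compiler may inline it when optimising. */
static inline long square_inline(long x)
{
    return x * x;
}

/* Sums the squares via the non-inlined helper. */
long sum_squares_noinline(const long *v, size_t n)
{
    assert(v != NULL);
    long total = 0;
    for (size_t i = 0; i < n; ++i)
        total += square_noinline(v[i]);
    return total;
}

/* Sums the squares via the inlinable helper. */
long sum_squares_inline(const long *v, size_t n)
{
    assert(v != NULL);
    long total = 0;
    for (size_t i = 0; i < n; ++i)
        total += square_inline(v[i]);
    return total;
}
```

If the whole program is built with inlining turned off (for example with GCC's `-fno-inline`), even the second variant pays the per-call overhead, so counts taken from such a build need not match timings taken from a normally optimised build. Profiling and timing the same binary avoids that mismatch.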

Upvotes: 0
