Reputation: 664
I'm asking about the answers to this question. In my answer, I first just took the time before and after the loops and printed the difference. But according to the update in @cigien's answer, it seems that I benchmarked inaccurately by not warming up the code.
What is warming up the code? I think what happened here is that the string was moved into the cache first, which made the benchmarking results for the following loops close to each other. In my old answer, the first benchmark result was slower than the others, I think because it took more time to move the string into the cache. Am I correct? If not, what does warming up actually do to the code? More generally, if possible, what should I have done besides warming up to get more accurate results? In other words, how do I benchmark C++ code correctly (and C code, if the same applies)?
Upvotes: 1
Views: 1820
Reputation: 333
To give you an example of warm-up, I recently benchmarked some NVIDIA CUDA kernel calls:
The execution speed seems to increase over time, probably for several reasons, such as the fact that the GPU frequency is variable (to save power and limit heat).
Sometimes a slow call has an even worse impact on the next call, so the benchmark can be misleading.
If you need to feel safe about these points, I advise you to:
Concerning the measurement tools, I have always had problems with high_resolution_clock on different machines, such as inconsistent durations. By contrast, the Windows QueryPerformanceCounter is very good.
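As a rough illustration of the warm-up and clock points above, here is a minimal sketch in standard C++. The helper name median_ns and its parameters are hypothetical, and using std::chrono::steady_clock is my assumption rather than something tested for this answer: it is guaranteed to be monotonic, whereas high_resolution_clock is on some implementations just an alias for system_clock.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical helper: calls `work` `warmup` times without timing it,
// then times `runs` further calls and returns the median in nanoseconds.
template <typename F>
long long median_ns(F&& work, std::size_t warmup, std::size_t runs) {
    assert(runs > 0);
    using clock = std::chrono::steady_clock;  // assumption: monotonic clock

    // Untimed warm-up: lets caches, clock frequencies, etc. settle.
    for (std::size_t i = 0; i < warmup; ++i) work();

    std::vector<long long> samples;
    samples.reserve(runs);
    for (std::size_t i = 0; i < runs; ++i) {
        const auto t0 = clock::now();
        work();
        const auto t1 = clock::now();
        samples.push_back(
            std::chrono::duration_cast<std::chrono::nanoseconds>(t1 - t0).count());
    }

    // The median is less sensitive to a few slow outliers than the mean.
    const auto mid = samples.begin() + static_cast<std::ptrdiff_t>(samples.size() / 2);
    std::nth_element(samples.begin(), mid, samples.end());
    return *mid;
}
```

You would call it with a lambda wrapping the code under test, for example median_ns([&] { run_loop(); }, 10, 100), where run_loop stands for whatever you are measuring. Very short workloads should be repeated inside the lambda so that each sample is well above the clock resolution.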
I hope that helps!
EDIT
I forgot to add that, as said in the comments, compiler optimizations can be annoying to deal with. The simplest way I have found is to increment a variable that depends on some non-trivial operations over both the warm-up and the measured data, in order to force sequential computation as much as possible.
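A minimal sketch of that idea, assuming the workload is a pass over a string as in the question. The names mix, timed_mix and Result are hypothetical. Each call is seeded with the previous result, so the compiler cannot compute one call and reuse it, and the final value has to be used (printed, for example) so the work is not discarded:

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical workload: a cheap but non-trivial pass over the string.
// The result depends on `seed`, so successive calls can be chained.
std::uint64_t mix(const std::string& s, std::uint64_t seed) {
    std::uint64_t h = seed;
    for (unsigned char c : s) h = h * 31 + c;
    return h;
}

struct Result {
    std::uint64_t sink;  // value to print so the work stays observable
    long long total_ns;  // time spent in the measured calls only
};

Result timed_mix(const std::string& s, std::size_t warmup, std::size_t runs) {
    assert(runs > 0);
    using clock = std::chrono::steady_clock;
    std::uint64_t sink = 0;

    // Warm-up calls feed the same variable as the measured ones.
    for (std::size_t i = 0; i < warmup; ++i) sink = mix(s, sink);

    const auto t0 = clock::now();
    for (std::size_t i = 0; i < runs; ++i) sink = mix(s, sink);
    const auto t1 = clock::now();

    return {sink,
            std::chrono::duration_cast<std::chrono::nanoseconds>(t1 - t0).count()};
}
```

Printing both fields of the returned Result after the measurement is enough to keep the chain of calls alive. This does not stop every optimization, but it removes the most common one, where the whole loop disappears because its result is never used.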
Upvotes: 3