Reputation: 45
Please check out my code and the question below. Thanks.
Code:
#include <iostream>
#include <chrono>
using namespace std;

int bufferWriteIndex = 0;
float curSample = 0;
float damping[5] = { 1, 1, 1, 1, 1 };
float modeDampingTermsExp[5] = { 0.447604, 0.0497871, 0.00247875, 0.00012341, 1.37263e-05 };
float modeDampingTermsExp2[5] = { -0.803847, -3, -6, -9, -11.1962 };

int main(int argc, char** argv) {
    float subt = 0;
    int subWriteIndex = 0;
    auto now = std::chrono::high_resolution_clock::now();
    while (true) {
        curSample = 0;
        for (int i = 0; i < 5; i++) {
            //Slow version
            damping[i] = damping[i] * modeDampingTermsExp2[i];
            //Fast version
            //damping[i] = damping[i] * modeDampingTermsExp[i];
            float cosT = 2 * damping[i];
            for (int m = 0; m < 5; m++) {
                curSample += cosT;
            }
        }
        //t += tIncr;
        bufferWriteIndex++;
        //measure calculations per second
        auto elapsed = std::chrono::high_resolution_clock::now() - now;
        if ((elapsed / std::chrono::milliseconds(1)) > 1000) {
            now = std::chrono::high_resolution_clock::now();
            int idx = bufferWriteIndex;
            cout << idx - subWriteIndex << endl;
            subWriteIndex = idx;
        }
    }
}
As you can see, I'm measuring the number of calculations (increments of bufferWriteIndex) per second.
Question:
Why is performance faster when using modeDampingTermsExp than when using modeDampingTermsExp2?
Program output when using modeDampingTermsExp:
12625671
12285846
12819392
11179072
12272587
11722863
12648955
Program output when using modeDampingTermsExp2:
1593620
1668170
1614495
1785965
1814576
1851797
1808568
1801945
That is roughly 7x faster. It seems like the values in those two arrays have an impact on calculation time. Why?
I am using Visual Studio 2019 with the following flags: /O2 /Oi /Ot /fp:fast
Upvotes: 0
Views: 90
Reputation: 13269
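This is because you are hitting denormal (subnormal) numbers, which many CPUs process much more slowly than normal floats (also see this question). With modeDampingTermsExp2, damping[0] is multiplied by -0.803847 on every iteration. Once it shrinks into the subnormal range, round-to-nearest keeps it at a tiny nonzero magnitude instead of letting it reach zero, so every later iteration works on a subnormal. With modeDampingTermsExp, every factor is below 0.5, so the values pass through the subnormal range quickly and round to exactly zero.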
You can get rid of denormals like so:
#include <cmath>
// [...]
for (int i = 0; i < 5; i++) {
    damping[i] = damping[i] * modeDampingTermsExp2[i];
    if (std::fpclassify(damping[i]) == FP_SUBNORMAL) {
        damping[i] = 0; // Treat denormals as 0.
    }
    float cosT = 2 * damping[i];
    for (int m = 0; m < 5; m++) {
        curSample += cosT;
    }
}
Upvotes: 1