Reputation: 63
Main question
Suppose I am writing a function to find the radius of the resulting globule when two water drops collide:
double drop_size_adder(double r1, double r2) {
    return cbrt((4 / 3 * M_PI * r1*r1*r1 + 4 / 3 * M_PI * r2*r2*r2) * 3 / (4 * M_PI));
}
(I realise that a lot of that factors out, but imagine it doesn't for the sake of this part of the question.)
I could rewrite the above calculation to make it much easier to understand what's going on by introducing various variables to split the calculation into steps:
double drop_size_adder(double r1, double r2) {
    double vol1, vol2, vol3, r3;
    vol1 = 4 / 3 * M_PI * r1*r1*r1;
    vol2 = 4 / 3 * M_PI * r2*r2*r2;
    vol3 = vol1 + vol2;
    r3 = cbrt(3 * vol3 / (4 * M_PI));
    return r3;
}
Would this significantly increase the time it takes for the function to run?
Secondary question
Additionally, suppose I rewrote the above function: would there be any significant runtime difference between the two blocks of code below?
#include <math.h>

double drop_size_adder(double r1, double r2) {
    return cbrt(r1*r1*r1 + r2*r2*r2);
}

int main()
{
    double water_r = 3.2;
    double oil_r = 5.4;
    double combined_r = drop_size_adder(water_r, oil_r);
}
and
#include <math.h>

int main()
{
    double water_r = 3.2;
    double oil_r = 5.4;
    double combined_r = cbrt(water_r*water_r*water_r + oil_r*oil_r*oil_r);
}
Upvotes: 1
Views: 75
Reputation: 123548
Write code for readability. Don't try to be clever.
This being said, if you still want to know what the difference is, you can look at the compiler's output (e.g. here). This:
#define M_PI 1
double cbrt(double);

double drop_size_adder(double r1, double r2) {
    return cbrt((4 / 3 * M_PI * r1*r1*r1 + 4 / 3 * M_PI * r2*r2*r2) * 3 / (4 * M_PI));
}

double drop_size_adder2(double r1, double r2) {
    double vol1, vol2, vol3, r3;
    vol1 = 4 / 3 * M_PI * r1*r1*r1;
    vol2 = 4 / 3 * M_PI * r2*r2*r2;
    vol3 = vol1 + vol2;
    r3 = cbrt(3 * vol3 / (4 * M_PI));
    return r3;
}
is translated by gcc (-O3) to:
_Z15drop_size_adderdd:
        movapd  xmm2, xmm0
        mulsd   xmm0, xmm0
        mulsd   xmm0, xmm2
        movapd  xmm2, xmm1
        mulsd   xmm2, xmm1
        mulsd   xmm2, xmm1
        addsd   xmm0, xmm2
        mulsd   xmm0, QWORD PTR .LC0[rip]
        mulsd   xmm0, QWORD PTR .LC1[rip]
        jmp     _Z4cbrtd
_Z16drop_size_adder2dd:
        movapd  xmm2, xmm0
        mulsd   xmm0, xmm0
        mulsd   xmm0, xmm2
        movapd  xmm2, xmm1
        mulsd   xmm2, xmm1
        mulsd   xmm2, xmm1
        addsd   xmm0, xmm2
        mulsd   xmm0, QWORD PTR .LC0[rip]
        mulsd   xmm0, QWORD PTR .LC1[rip]
        jmp     _Z4cbrtd
.LC0:
        .long   0
        .long   1074266112
.LC1:
        .long   0
        .long   1070596096
I am not fluent in assembly, but obviously the two are identical. Conclusion: if you do care about efficiency, then still write your code for readability. The compiler very likely knows better than you how to optimize code.
Upvotes: 2
Reputation: 15890
In a debug build, in a debugger, that code will be much easier to step through and see what's going on, yes.
In an optimized build, the output should be identical to the less 'steppable' version[s].
You can even use something like Compiler Explorer to compare the compiled output with various compiler flags. -O2 (upper case 'O', not zero) is a commonly supported optimization flag. It'll also let you compare the output/behavior across different compilers.
Without an optimization flag, the assembly will change. With it, it'll be the same. Which is precisely what you want.
Compiler optimizations are pretty darn good these days. Optimizing your execution speed is more about algorithms/containers ("big O") and memory usage patterns ("cache coherency") than it is about tweaking the assembly just so.
Upvotes: 1