Reputation: 2907
A friend of mine told me that the pow function is slower than simply multiplying the base by itself as many times as the exponent requires. For example, according to him,
#include <stdio.h>
#include <math.h>
int main () {
    double e = 2.71828;
    double e2 = pow (e, 2.0);
    printf("%le", e2);
}
is slower than
#include <stdio.h>
int main() {
    double e = 2.71828;
    double e2 = e * e;
    printf("%le", e2);
}
As a novice, I would have thought that both run at the same speed, and by that logic I would prefer the former for its pithiness. So why is the former block of code slower than the latter?
Upvotes: 2
Views: 2049
Reputation: 126827
Because the pow function must implement a more generic algorithm that works in all cases (in particular, it must be able to raise to any rational exponent representable by a double), while e*e is just a simple multiplication that boils down to one or two assembly instructions.
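To make the contrast concrete, here is a minimal sketch of the special case the question has in mind: raising to a non-negative integer power using nothing but multiplications (exponentiation by squaring). The name ipow is hypothetical and not part of the standard library; unlike pow, it cannot handle fractional or negative exponents at all.

```c
/* Hypothetical helper (not a standard function): computes base^exp for a
   non-negative integer exponent using only multiplications. */
double ipow(double base, unsigned int exp)
{
    double result = 1.0;
    while (exp > 0) {
        if (exp & 1u)        /* lowest bit set: fold the current power in */
            result *= base;
        base *= base;        /* square the running power of the base */
        exp >>= 1;           /* move on to the next bit of the exponent */
    }
    return result;
}
```

For an exponent of 2 the loop amounts to a couple of multiplications, which is essentially what e*e does directly.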
Still, if the compiler is smart enough, it may replace your pow(e, 2.0) with e*e automatically anyway (well, actually, in your case it will probably just perform the whole computation at compile time).
Just for fun, I ran some tests: compiling the following code
#include <math.h>

double pow2(double value)
{
    return pow(value, 2.);
}

double knownpow2()
{
    double e=2.71828;
    return pow(e, 2.);
}

double valuexvalue(double value)
{
    return value*value;
}

double knownvaluexvalue()
{
    double e=2.71828;
    return e*e;
}
with g++ -O3 -c pow.c (g++ 4.7.3) and disassembling the output with objdump -d -M intel pow.o, I get:
0000000000000000 <_Z4pow2d>:
0: f2 0f 59 c0 mulsd xmm0,xmm0
4: c3 ret
5: 66 66 2e 0f 1f 84 00 data32 nop WORD PTR cs:[rax+rax*1+0x0]
c: 00 00 00 00
0000000000000010 <_Z9knownpow2v>:
10: f2 0f 10 05 00 00 00 movsd xmm0,QWORD PTR [rip+0x0] # 18 <_Z9knownpow2v+0x8>
17: 00
18: c3 ret
19: 0f 1f 80 00 00 00 00 nop DWORD PTR [rax+0x0]
0000000000000020 <_Z11valuexvalued>:
20: f2 0f 59 c0 mulsd xmm0,xmm0
24: c3 ret
25: 66 66 2e 0f 1f 84 00 data32 nop WORD PTR cs:[rax+rax*1+0x0]
2c: 00 00 00 00
0000000000000030 <_Z16knownvaluexvaluev>:
30: f2 0f 10 05 00 00 00 movsd xmm0,QWORD PTR [rip+0x0] # 38 <_Z16knownvaluexvaluev+0x8>
37: 00
38: c3 ret
So, where the compiler already knew all the values involved, it just performed the computation at compile time; and for both pow2 and valuexvalue it emitted a single mulsd xmm0,xmm0 (i.e. in both cases it boils down to multiplying the value by itself in a single assembly instruction).
Upvotes: 4
Reputation: 7719
Here is one (simple, heed the comment) pow implementation. Being generic, it involves a number of branches, a potential division, and calls to exp, log, modf, and so on.
On the other hand, the multiplication is a single instruction (give or take) on most higher-end CPUs.
Upvotes: 0
Reputation: 564451
pow(double,double) needs to handle raising to any power, not just an integer power, let alone specifically 2. As such, it's far more complicated than a simple multiplication of two double values.
Upvotes: 5