This question spawned from a separate question, which turned out to have some apparently machine-specific quirks. When I run the C++ code listed below, which records the timing difference between tanh and exp, I see the following result:

tanh: 5.22203
exp: 14.9393

tanh runs ~3x as fast as exp. This is somewhat surprising considering the mathematical definition of tanh (and being ignorant of whatever algorithmic definition is actually implemented).
What's more is that this happens on my laptop (Ubuntu 16.04, Intel Core i7-3517U CPU @ 1.90GHz × 4), but does not occur on my desktop (same OS, not sure about CPU specs right now).
I compiled the code below with g++. The above times were with no compiler optimization, although the trend remains if I use -On for each n. I also fiddled with the a and b values to see if the range of values being evaluated was having an effect. This doesn't seem to matter.

What would cause tanh to be faster than exp on different machines?
#include <iostream>
#include <cmath>
#include <ctime>

using namespace std;

int main() {
    double a = -5;
    double b = 5;
    int N = 10001;
    double x[10001];
    double y[10001];
    double h = (b-a) / (N-1);
    clock_t begin, end;

    for(int i=0; i < N; i++)
        x[i] = a + i*h;

    begin = clock();
    for(int i=0; i < N; i++)
        for(int j=0; j < N; j++)   // repeat each evaluation N times
            y[i] = tanh(x[i]);
    end = clock();
    cout << "tanh: " << double(end - begin) / CLOCKS_PER_SEC << "\n";

    begin = clock();
    for(int i=0; i < N; i++)
        for(int j=0; j < N; j++)
            y[i] = exp(x[i]);
    end = clock();
    cout << "exp: " << double(end - begin) / CLOCKS_PER_SEC << "\n";

    return 0;
}
This is the output I get when I compile the simplified code below with g++ -g -O -Wa,-aslh nothing2.cpp > stuff.txt.
#include <cmath>

int main() {
    double x = 0.0;
    double y, z;
    y = tanh(x);
    z = exp(x);
    return 0;
}
Assume nothing2.cpp contains the simplified code in the previous edit. I run:

g++ -o nothing2.so -shared -fPIC nothing2.cpp
objdump -d nothing2.so > stuff.txt

Here are the contents of stuff.txt
Upvotes: 4
Views: 1979
There are various possible explanations, and the one that applies in your case depends on which platform you're using and exactly which math library is in use. But one possible explanation is this:

First of all, the calculation of tanh does not rely on the standard definition of tanh; instead, it is expressed in terms of exp(-2*x) or expm1(2*x), which means only one exponential has to be calculated, and that is probably the heavy operation (in addition there's a division and a few additions).
Second, and this may be the trick: for largish values of x this reduces to (exp(2*x)-1)/(exp(2*x)+1) = 1 - 2/(expm1(2*x)+2). The advantage here is that since the second term is smallish, it doesn't have to be calculated to the same relative accuracy to get the same final accuracy. This translates into not needing the full accuracy of expm1 here that one would need in general.
Similarly, for large negative x there's a matching trick of rewriting it as (1-exp(-2*x))/(1+exp(-2*x)) = -1 + 2/(expm1(-2*x)+2), which again means we can take advantage of the factor exp(-2*x) being large and not have to calculate it to full accuracy. However, you don't have to actually calculate it this way; you can use the expression -expm1(-2*x)/(2+expm1(-2*x)) instead, with the same relaxed accuracy requirement on expm1.
In addition, there are other optimizations available for larger values of x that aren't possible for exp, of basically the same origin. With large x the factor expm1(2*x) becomes so large that the term 2/(expm1(2*x)+2) can simply be discarded entirely, while for exp the exponential still has to be calculated (and this is even the case for large negative x). For these values tanh is immediately decided to be 1, while exp must actually be calculated.
Upvotes: 4