Matt Hancock

Reputation: 4039

Why is tanh faster than exp on my machine?

This question spawned from a separate question, which turned out to have some apparently machine-specific quirks. When I run the C++ code listed below to record the timing difference between tanh and exp, I see the following result:

tanh: 5.22203
exp: 14.9393

tanh runs ~3x as fast as exp. This is somewhat surprising considering the mathematical definition of tanh (though I don't know which algorithm is actually implemented).

What's more, this happens on my laptop (Ubuntu 16.04, Intel Core i7-3517U CPU @ 1.90GHz × 4), but does not occur on my desktop (same OS, not sure about CPU specs right now).

I compiled the code below with g++. The above times were with no compiler optimization, although the trend remains if I use -On for each n. I also fiddled with a and b values to see if the range of values being evaluated was having an effect. This doesn't seem to matter.

What would cause tanh to be faster than exp on some machines but not on others?

#include <iostream>
#include <cmath>
#include <ctime>

using namespace std;

int main() {
    double a = -5;
    double b =  5;
    int N =  10001;
    double x[10001];
    double y[10001];
    double h = (b-a) / (N-1);

    clock_t begin, end;

    for(int i=0; i < N; i++)
        x[i] = a + i*h;

    begin = clock();

    for(int i=0; i < N; i++)
        for(int j=0; j < N; j++)
            y[i] = tanh(x[i]);

    end = clock();

    cout << "tanh: " << double(end - begin) / CLOCKS_PER_SEC << "\n";

    begin = clock();

    for(int i=0; i < N; i++)
        for(int j=0; j < N; j++)
            y[i] = exp(x[i]);

    end = clock();

    cout << "exp: " << double(end - begin) / CLOCKS_PER_SEC << "\n";


    return 0;
}

edit: some assembly output

This is the output when I compile the simplified code below with g++ -g -O -Wa,-aslh nothing2.cpp > stuff.txt.

#include <cmath>

int main() {
    double x = 0.0;
    double y,z;
    y = tanh(x);
    z = exp(x);
    return 0;
}

edit: another update

Assume nothing2.cpp contains the simplified code in the previous edit. I run:

g++ -o nothing2.so -shared -fPIC nothing2.cpp
objdump -d nothing2.so > stuff.txt

Here are the contents of stuff.txt

Upvotes: 4

Views: 1979

Answers (1)

skyking

Reputation: 14359

There are various possible explanations, and which one applies in your case depends on the platform you're using, or more exactly on which math library is in use. One possible explanation is:

First, the calculation of tanh does not rely on the standard definition of tanh. Instead, one expresses it in terms of exp(-2*x) or expm1(2*x), which means one only has to calculate one exponential, which is probably the heavy operation (in addition there's a division and some additions).
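To make the single-exponential point concrete, here is a minimal sketch. The function name tanh_one_exp is hypothetical, and this is not the code of any particular math library; it only illustrates that one call to exp plus a division and two additions is enough.

```cpp
#include <cmath>

// Sketch only: tanh from a single exponential,
//   tanh(|x|) = (1 - e^(-2|x|)) / (1 + e^(-2|x|)).
// A real libm treats tiny arguments and rounding more carefully.
double tanh_one_exp(double x) {
    const double e = std::exp(-2.0 * std::fabs(x)); // in (0, 1], cannot overflow
    const double t = (1.0 - e) / (1.0 + e);         // tanh(|x|)
    return std::copysign(t, x);                     // tanh is an odd function
}
```

Working on |x| and restoring the sign afterwards is a design choice of this sketch: it keeps the exponential in (0, 1], so very large inputs cannot overflow.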

Second, and this may be the trick, for largish values of x this reduces to (exp(2*x)-1)/(exp(2*x)+1) = 1 - 2/(expm1(2*x)+2). The advantage here is that, since the second term is smallish, it doesn't have to be calculated to the same relative accuracy to get the same final accuracy. This means one wouldn't need the full accuracy of expm1 here that is needed in general.

Also, for smallish (strongly negative) values of x there's a similar trick in rewriting it as (1-exp(-2*x))/(1+exp(-2*x)) = -1/(1 + 2/expm1(-2*x)), which again means that we can take advantage of the factor exp(-2*x) being large and not having to calculate it to the same accuracy. However, you don't have to actually calculate it this way; you can use the expression -expm1(-2*x)/(2+expm1(-2*x)) instead, with the same accuracy requirement on expm1.
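The two expm1-based rewrites can be combined into one function, sketched below. The name tanh_expm1 and the split point |x| = 1 are assumptions made for illustration, and the sketch applies both formulas to |x| and restores the sign at the end, which may differ from what your library does.

```cpp
#include <cmath>

// Sketch only: tanh built from expm1, with an assumed branch at |x| = 1.
double tanh_expm1(double x) {
    const double ax = std::fabs(x);
    double t;
    if (ax >= 1.0) {
        // tanh(|x|) = 1 - 2 / (expm1(2|x|) + 2); the correction term is small
        t = 1.0 - 2.0 / (std::expm1(2.0 * ax) + 2.0);
    } else {
        // tanh(|x|) = -expm1(-2|x|) / (expm1(-2|x|) + 2); no cancellation near 0
        const double m = std::expm1(-2.0 * ax);
        t = -m / (m + 2.0);
    }
    return std::copysign(t, x); // tanh is an odd function
}
```

The second branch is the reason for using expm1 rather than exp: for tiny |x| the value expm1(-2|x|) keeps its leading digits, whereas 1 - exp(-2|x|) would lose them to cancellation.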

In addition, there are other optimizations of basically the same origin available for larger values of x that aren't possible for exp. With large x the term expm1(2*x) becomes so large that the 2/(expm1(2*x)+2) correction can simply be discarded entirely, while for exp we still have to calculate it (this is even the case for large negative x). For these values tanh can immediately be decided to be 1 (or -1 for large negative x), while exp must be calculated.
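The early exit for large arguments could look roughly like the sketch below. The cutoff kSaturation = 22.0 and the function name are assumptions for illustration: at |x| = 22 the correction 2/(exp(44)+1) is about 1.6e-19, which is below half the spacing of doubles just under 1, so the result rounds to exactly +-1. The call to std::tanh merely stands in for whatever general path a library would use.

```cpp
#include <cmath>

// Assumed cutoff: beyond it, tanh(x) rounds to exactly +-1 in double precision.
constexpr double kSaturation = 22.0;

// Sketch only: return +-1 without evaluating any exponential for large |x|.
double tanh_with_shortcut(double x) {
    if (std::fabs(x) >= kSaturation)
        return std::copysign(1.0, x); // saturated: just a compare and a sign copy
    return std::tanh(x);              // stand-in for the general computation
}
```

No comparable shortcut exists for exp over the same range, since exp(22) and exp(-22) are both ordinary finite doubles that still have to be computed.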

Upvotes: 4
