Reputation: 11
I am trying to understand the mechanics of the logarithm implementation in the GNU C Library that can be found under glibc > sysdeps > ieee754 > flt-32 > e_logf.c
https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/ieee754/flt-32/e_logf.c
Especially the OFF value (0x3f330000) bugs me in two ways.

How is OFF used to get the exponent k? I know the usual approach of interpreting the float's bits as an integer, then shifting and subtracting the bias, e.g. for a (positive) float x = 2^k * z:

int k = (asuint(x) >> 23) - 127;

Here it seems to work the other way round, but I don't understand how that'd work.

Moreover, there's this comment:

/* x = 2^k z; where z is in range [OFF,2*OFF] and exact. [...] */

And I don't understand why z (the mantissa, as I understand it) would lie in that interval and not [1, 2).
There's more I don't get but maybe I can figure it out once I understand what OFF stands for. Any help is appreciated!
Upvotes: 1
Views: 191
Reputation: 26085
The code referenced in the question is very likely based on the assumption that the C type float is mapped to the binary32 type of the IEEE-754 floating-point standard. John Bollinger's answer already addresses the bit-level manipulation of binary32 numbers used during argument reduction.
Without having looked at the code, just based on the code comments in the question, it is clear that OFF represents the lower bound of the interval over which the core approximation employed by the code is valid. Specifically, the bit pattern 0x3f330000, interpreted as a binary32 number, corresponds to the value 0x1.660000p-1, which is exactly 0.69921875. The upper bound is twice the lower bound, or exactly 1.3984375, as the cited code comment points out.
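A quick standalone check of those values (my own sketch, not part of the glibc source) just reinterprets the bit pattern:

#include <stdio.h>
#include <stdint.h>
#include <string.h>

int main (void)
{
    uint32_t bits = 0x3f330000;  /* OFF from e_logf.c */
    float off;
    memcpy (&off, &bits, sizeof off);  /* reinterpret bit pattern as binary32 */
    printf ("OFF   = %a = %.8f\n", off, off);                /* 0x1.66p-1 = 0.69921875 */
    printf ("2*OFF = %a = %.8f\n", 2.0f * off, 2.0f * off);  /* 0x1.66p+0 = 1.39843750 */
    return 0;
}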
Typical implementations of log ultimately approximate log1p around zero, which means they approximate log around unity. It is numerically advantageous for the lower bound of the core approximation to be >= exp(-0.5) ~= 0.60653 and for the upper bound to be <= exp(0.5) ~= 1.64872, under the restriction that the upper bound is twice the lower bound. Within those constraints the particular choice of cutoff is a design parameter of the log algorithm. It is impossible to know why the code's author made this particular choice; a reasonable guess is that it minimizes the overall numerical error of the implementation.
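To make the constraint concrete: since the upper bound is twice the lower bound, the lower bound must fall in [exp(-0.5), exp(0.5)/2] ~= [0.60653, 0.82436]. A quick check (my own sketch, not part of any library code) that all the cutoffs discussed here, including glibc's OFF, land in that window:

#include <stdio.h>
#include <math.h>

int main (void)
{
    double lo = exp (-0.5);     /* ~0.60653: smallest admissible lower bound */
    double hi = exp (0.5) / 2;  /* ~0.82436: largest, since upper = 2 * lower */
    double cutoff[] = { sqrt (0.5), 0.75, 2.0 / 3.0, 0x1.66p-1 /* glibc OFF */ };
    for (int i = 0; i < 4; i++) {
        printf ("cutoff %.8f admissible? %s\n", cutoff[i],
                ((cutoff[i] >= lo) && (cutoff[i] <= hi)) ? "yes" : "no");
    }
    return 0;
}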
Below I am showing a simple ISO-C99 implementation of logf() in which I experimented with three different design choices for the cutoff value for precisely this reason, ultimately settling on a cutoff of 2/3 as providing the smallest ulp error. This code also contains a portable implementation of the argument reduction using standard C library functions, which I hope makes the bit-level manipulations easier to understand.
#include <stdlib.h>
#include <stdint.h>
#include <string.h>
#include <float.h>
#include <math.h>
/* reinterpret the bits of a float as a 32-bit integer, and vice versa */
int32_t float_as_int (float a) { int32_t r; memcpy (&r, &a, sizeof r); return r; }
float int_as_float (int32_t a) { float r; memcpy (&r, &a, sizeof r); return r; }
/* natural logarithm */
float my_logf (float a)
{
float m, r, s, t, i, f;
int32_t e;
#if LOG_VARIANT == 1
// max ulp err = 0.86280
const float cutoff_f = 0.707106781f;
const int32_t cutoff_i = 0x3f3504f3;
#elif LOG_VARIANT == 2
// max ulp err = 0.88098
const float cutoff_f = 0.750000000f;
const int32_t cutoff_i = 0x3f400000;
#elif LOG_VARIANT == 3
// max ulp err = 0.85089
const float cutoff_f = 0.666666667f;
const int32_t cutoff_i = 0x3f2aaaab;
#endif // LOG_VARIANT
if ((a > 0.0f) && (a <= 0x1.fffffep+127f)) { // 3.40282347e+38f
#if PORTABLE
m = frexpf (a, &e);
if (m < cutoff_f) {
m = m + m;
e = e - 1;
}
i = (float)e;
#else // PORTABLE
i = 0.0f;
/* fix up subnormal inputs */
if (a < 0x1.0p-126f){ // 1.175494351e-38
a = a * 0x1.0p+23f; // 8388608.0
i = -23.0f;
}
e = (float_as_int (a) - cutoff_i) & 0xff800000;
m = int_as_float (float_as_int (a) - e);
i = fmaf ((float)e, 0x1.0p-23f, i); // 1.19209290e-7
#endif // PORTABLE
f = m - 1.0f;
s = f * f;
#if LOG_VARIANT == 1
/* Compute log1p(f) for f in [sqrt(0.5)-1, sqrt(2.0)-1] */
r = -0x1.384000p-4f; // -7.62329102e-2
t = 0x1.084000p-3f; // 1.29028320e-1
r = fmaf (r, s, -0x1.0f218cp-3f); // -1.32388204e-1
t = fmaf (t, s, 0x1.226b04p-3f); // 1.41805679e-1
r = fmaf (r, s, -0x1.5427a4p-3f); // -1.66091233e-1
t = fmaf (t, s, 0x1.99a35ap-3f); // 2.00018600e-1
r = fmaf (r, s, -0x1.000416p-2f); // -2.50015587e-1
r = fmaf (t, f, r);
r = fmaf (r, f, 0x1.55555cp-2f); // 3.33333433e-1
r = fmaf (r, f, -0x1.fffffap-2f); // -4.99999911e-1
#elif LOG_VARIANT == 2
/* compute log1p(f), f in [-0.25, 0.5] */
r = -0x1.738000p-5f; // -4.53491211e-2
t = 0x1.b02000p-4f; // 1.05499268e-1
r = fmaf (r, s, -0x1.0ef3e4p-3f); // -1.32301122e-1
t = fmaf (t, s, 0x1.28c48ap-3f); // 1.44906119e-1
r = fmaf (r, s, -0x1.54d148p-3f); // -1.66414797e-1
t = fmaf (t, s, 0x1.995f4ap-3f); // 1.99888781e-1
r = fmaf (r, s, -0x1.000084p-2f); // -2.50001967e-1
r = fmaf (t, f, r);
r = fmaf (r, f, 0x1.5555d0p-2f); // 3.33335161e-1
r = fmaf (r, f, -0x1.000000p-1f); // -5.00000000e-1
#elif LOG_VARIANT == 3
/* Compute log1p(f) for f in [-1/3, 1/3] */
r = -0x1.0ae000p-3f; // -0.130310059
t = 0x1.208000p-3f; // 0.140869141
r = fmaf (r, s, -0x1.f1988ap-4f); // -0.121483363
t = fmaf (t, s, 0x1.1e5740p-3f); // 0.139814854
r = fmaf (r, s, -0x1.55b36ep-3f); // -0.166846141
t = fmaf (t, s, 0x1.99d8b2p-3f); // 0.200120345
r = fmaf (r, s, -0x1.fffe02p-3f); // -0.249996200
r = fmaf (t, f, r);
r = fmaf (r, f, 0x1.5554fap-2f); // 0.333331972
r = fmaf (r, f, -0x1.000000p-1f); // -0.500000000
#endif // LOG_VARIANT
r = fmaf (r, s, f);
r = fmaf (i, 0x1.62e430p-01f, r); // 6.93147182e-1 // log(2)
} else {
r = a + a; // silence NaNs if necessary
if (a < 0.0f) r = 0.0f/0.0f; // QNaN INDEFINITE
if (a == 0.0f) r = -INFINITY; // -INF
}
return r;
}
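Note that LOG_VARIANT must be defined at compile time, e.g. cc -DLOG_VARIANT=3 -O2 logf.c -lm (file name hypothetical), otherwise the cutoff constants are left undefined. A minimal harness along the following lines, appended to the listing above, can be used to spot-check results against the system's logf(); the test values are my own, chosen to straddle the interval endpoints:

#include <stdio.h>

int main (void)
{
    /* spot checks, including the interval endpoints 0x1.66p-1 and 0x1.66p+0 */
    float test[] = { 0.1f, 0x1.66p-1f, 1.0f, 0x1.66p+0f, 2.718282f, 1.0e30f };
    for (int i = 0; i < (int)(sizeof test / sizeof test[0]); i++) {
        printf ("x=%15.8e  my_logf=%15.8e  logf=%15.8e\n",
                test[i], my_logf (test[i]), logf (test[i]));
    }
    return 0;
}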
Upvotes: 1
Reputation: 180048
Taking the second question first:

    Moreover, there's this comment:
    /* x = 2^k z; where z is in range [OFF,2*OFF] and exact. [...] */
    And I don't understand why z (the mantissa, as I understand it) would lie in that interval and not [1, 2).
With respect to that formula, interpreting the place values of the bits of z's binary representation such that z always falls in any particular non-empty interval is just a question of how you choose k. Thus, that z lies in the particular interval specified in the comment can be taken as a definition. Without such a definition, there are infinitely many pairs of z and k that satisfy the formula.
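As a concrete illustration (mine, not from the linked code), the same value can be decomposed in several ways, and only the interval required of z makes the pair unique:

#include <stdio.h>

int main (void)
{
    float x = 6.0f;
    /* x = 2^k * z has many solutions; the interval chosen for z selects one */
    printf ("x = 2^2 * %g\n", x / 4.0f);  /* z = 1.5:  the [1, 2) convention */
    printf ("x = 2^3 * %g\n", x / 8.0f);  /* z = 0.75: in [OFF, 2*OFF], so e_logf.c picks k = 3 */
    return 0;
}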
Turning to the first question:

    How is OFF used to get the exponent k? I know the usual approach of interpreting the float's bits as an integer, then shifting and subtracting the bias, e.g. for a (positive) float x = 2^k * z:
    int k = (asuint(x) >> 23) - 127;
    Here it seems to work the other way round, but I don't understand how that'd work.
You can shift then subtract or subtract a different, shifted value then shift. You can convert back and forth between the two forms algebraically. It's just distributing a multiplicative operation over an additive one vs. factoring it out.
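A minimal demonstration of that equivalence (my own sketch; it assumes an arithmetic right shift of negative values, just as the glibc code's comment does):

#include <stdio.h>
#include <stdint.h>
#include <string.h>

int main (void)
{
    float x = 0.8f;  /* any positive normal float */
    int32_t ix;
    memcpy (&ix, &x, sizeof ix);
    /* shift, then subtract the bias ... */
    int k1 = (ix >> 23) - 127;
    /* ... or subtract the pre-shifted bias, then shift */
    int k2 = (int32_t)(ix - (127 << 23)) >> 23;
    printf ("k1 = %d, k2 = %d\n", k1, k2);  /* both -1, since 0.8 = 2^-1 * 1.6 */
    return 0;
}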
The value defined for OFF is equal to ((126 << 23) + 0x330000), which is strictly between 126 << 23 and 127 << 23. The code you linked does this:
tmp = ix - OFF;
[...]
k = (int32_t) tmp >> 23; /* arithmetic shift */
Given the defined value of OFF, that's the same as

k = (ix - (126 << 23) - 0x330000) >> 23;

The value 0x330000 has only 22 significant bits, so the result of that computation will be either equal to (ix >> 23) - 126 (in most cases; does that look familiar?) or one less than that. It's computing the exponent based on a bias of 126.
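A small demonstration of both cases (the helper k_of is mine, not part of glibc):

#include <stdio.h>
#include <stdint.h>
#include <string.h>

#define OFF 0x3f330000

static int k_of (float x)
{
    uint32_t ix;
    memcpy (&ix, &x, sizeof ix);
    return (int32_t)(ix - OFF) >> 23;  /* arithmetic shift, as in e_logf.c */
}

int main (void)
{
    /* mantissa bits of 1.5f are 0x400000 >= 0x330000: no borrow, k = (ix >> 23) - 126 = 1 */
    printf ("k(1.5) = %d\n", k_of (1.5f));  /* 1, so z = 1.5 / 2 = 0.75 */
    /* mantissa bits of 1.0f are 0x000000 <  0x330000: borrow, one less */
    printf ("k(1.0) = %d\n", k_of (1.0f));  /* 0, so z = 1.0 */
    return 0;
}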
Why 126 instead of 127? I think it's because the computation does not include the implicit leading 1 bit in the mantissa.
The question remains what the role of the extra 0x330000 is. I'm not positive, but I am inclined to guess that it is some sort of tweak parameter.
Upvotes: 1