Vortico
Vortico

Reputation: 2749

Get the positions of the 'ones' digits in a base-2 representation of a C float

Say I have a floating point number. I would like to extract the positions of all the ones digits in the number's base 2 representation.

For example, 10.25 = 2^-2 + 2^1 + 2^3, so its base-2 ones positions are {-2, 1, 3}.

Once I have the list of base-2 powers of a number n, the following should always return true (in pseudocode).

sum = 0
for power in powers:
    sum += 2.0 ** power
return n == sum

However, it is somewhat difficult to perform bit logic on floats in C and C++, and even more difficult to be portable.

How would one implement this in either of the languages with a small number of CPU instructions?

Upvotes: 2

Views: 246

Answers (2)

Eric Postpischil
Eric Postpischil

Reputation: 223359

There is no need to work with the encoding of floating-point numbers. C provides routines for working with floating-point values in a portable way. The following works.

#include <math.h>
#include <stdio.h>
#include <stdlib.h>


int main(int argc, char *argv[])
{
    /*  This should be replaced with proper allocation for the floating-point
        type.
    */
    int powers[53];
    double x = atof(argv[1]);

    if (x <= 0)
    {
        fprintf(stderr, "Error, input must be positive.\n");
        return 1;
    }

    // Find value of highest bit.
    int e;
    double f = frexp(x, &e) - .5;
    powers[0] = --e;
    int p = 1;

    // Find remaining bits.
    for (; 0 != f; --e)
    {
        printf("e = %d, f = %g.\n", e, f);
        if (.5 <= f)
        {
            powers[p++] = e;
            f -= .5;
        }
        f *= 2;
    }

    // Display.
    printf("%.19g =", x);
    for (int i = 0; i < p; ++i)
        printf(" + 2**%d", powers[i]);
    printf(".\n");

    // Test.
    double y = 0;
    for (int i = 0; i < p; ++i)
        y += ldexp(1, powers[i]);

    if (x == y)
        printf("Reconstructed number equals original.\n");
    else
        printf("Reconstructed number is %.19g, but original is %.19g.\n", y, x);

    return 0;
}

Upvotes: 4

Dietrich Epp
Dietrich Epp

Reputation: 213548

Give up on portability, assume IEEE float and 32-bit int.

// Doesn't check for NaN or denormalized.
// Left as an exercise for the reader.
void pbits(float x)
{
    union {
        float f;
        unsigned i;
    } u;
    int sign, mantissa, exponent, i;
    u.f = x;
    sign = u.i >> 31;
    exponent = ((u.i >> 23) & 255) - 127;
    mantissa = (u.i & ((1 << 23) - 1)) | (1 << 23);
    for (i = 0; i < 24; ++i) {
        if (mantissa & (1 << (23 - i)))
            printf("2^%d\n", exponent - i);
    }
}

This will print out the powers of two that sum to the given floating point number. For example,

$ ./a.out 156
2^7
2^4
2^3
2^2
$ ./a.out 0.3333333333333333333333333
2^-2
2^-4
2^-6
2^-8
2^-10
2^-12
2^-14
2^-16
2^-18
2^-20
2^-22
2^-24
2^-25

You can see how 1/3 is rounded up, which is not intuitive since we would always round it down in decimal, no matter how many decimal places we use.

Footnote: Don't do the following:

float x = ...;
unsigned i = *(unsigned *) &x; // no

The trick with the union is far less likely to generate warnings or confuse the compiler.

Upvotes: 4

Related Questions