JustBlossom

Reputation: 1329

C Bit-Level Int to Float Conversion Unexpected Output

Background:
I am playing around with bit-level coding (this is not homework - just curious). I found a lot of good material online and in a book called Hacker's Delight, but I am having trouble with one of the online problems.

It asks to convert an integer to a float. I used the following links as reference to work through the problem:

How to manually (bitwise) perform (float)x?
How to convert an unsigned int to a float?
http://locklessinc.com/articles/i2f/

Problem and Question:
I thought I understood the process well enough (I tried to document the process in the comments), but when I test it, I don't understand the output.

Test Cases:
float_i2f(2) returns 1073741824
float_i2f(3) returns 1077936128

I expected to see something like 2.0000 and 3.0000.

Did I mess up the conversion somewhere? I thought maybe this was a memory address, so I was thinking maybe I missed something in the conversion step needed to access the actual number? Or maybe I am printing it incorrectly? I am printing my output like this:

printf("Float_i2f ( %d ): ", 3);
printf("%u", float_i2f(3));
printf("\n");

But I thought that printing method was fine for unsigned values in C (I'm used to programming in Java).

Thanks for any advice.

Code:

/*
    * float_i2f - Return bit-level equivalent of expression (float) x
    *   Result is returned as unsigned int, but
    *   it is to be interpreted as the bit-level representation of a
    *   single-precision floating point value.
    *   Legal ops: Any integer/unsigned operations incl. ||, &&. also if, while
    *   Max ops: 30
    *   Rating: 4
    */
    unsigned float_i2f(int x) {
        if (x == 0){
            return 0;
        }

        //save the sign bit for later and get the absolute value of x
        //the absolute value is needed to shift bits to put them
        //into the appropriate position for the float
        unsigned int signBit = 0;
        unsigned int absVal = (unsigned int)x;

        if (x < 0){
            signBit = 0x80000000;
            absVal = -(unsigned int)x; //negate after the cast so INT_MIN does not overflow
        }

        //Calculate the exponent
        // Shift the input left until the high order bit is set to form the mantissa.
        // Form the floating exponent by subtracting the number of shifts from 158.
        unsigned int exponent = 158; //158 = 127 (exponent bias) + 31 (position of the highest bit)

        while ((absVal & 0x80000000) == 0){//this checks for 0 or 1. when it reaches 1, the loop breaks
            exponent--;
            absVal <<= 1;
        }

        //find the mantissa (bit shift to the right)
        unsigned int mantissa = absVal >> 8;

        //place the exponent bits in the right place
        exponent = exponent << 23;

        //get the mantissa
        mantissa = mantissa & 0x7fffff;

        //return the reconstructed float
        return signBit | exponent | mantissa;
    }

Upvotes: 2

Views: 2847

Answers (2)

paddy

Reputation: 63451

I'll just chime in here, because nothing specifically about endianness has been addressed. So let's talk about it.

  1. The construction of the value in the original question was endianness-agnostic, using shifts and other bitwise operations. This means that regardless of whether your system is big- or little-endian, the actual value will be the same. The difference will be its byte order in memory.

  2. IEEE-754 does not specify a byte order for floats in memory, so the standard itself places no requirement on implementations. This means that if you want to directly reinterpret your integer value as a float, the integer must be laid out in whatever byte order your platform uses for floats.

So, you can use this approach combined with a union if and only if you know that the endianness of floats and integers on your system is the same.

On the common Intel-based architectures this is the case: both integers and floats are little-endian, so no conversion is needed. On a platform where floats are big-endian and integers are not, you need to convert your value to big-endian. A simple approach to this is to repack its bytes, which is harmless even if they are already big-endian:

uint32_t n = float_i2f( input_val );
uint8_t bytes[4] = {   /* most significant byte first */
    (uint8_t)((n >> 24) & 0xff),
    (uint8_t)((n >> 16) & 0xff),
    (uint8_t)((n >> 8) & 0xff),
    (uint8_t)(n & 0xff)
};
float fval;
memcpy( &fval, bytes, sizeof(float) );

I'll stress that you only need to worry about this if you are trying to reinterpret your integer representation as a float or the other way round.

If you're only trying to output what the representation is in bits, then you don't need to worry. You can just display your integer in a useful form such as hex:

printf( "0x%08x\n", n );

Upvotes: 0

David C. Rankin

Reputation: 84521

Continuing from the comment. Your code is correct, and you are simply looking at the unsigned integer formed by the bits of your IEEE-754 single-precision floating-point number. The IEEE-754 single-precision format (made up of the sign, biased exponent, and mantissa) can be interpreted as a float, or those same 32 bits can be interpreted as an unsigned integer. You are printing the unsigned interpretation of the floating-point number.

You can confirm with a simple union. For example:

#include <stdio.h>
#include <stdint.h>

typedef union {
    uint32_t u;
    float f;
} u2f;

int main (void) {

    u2f tmp = { .f = 2.0 };
    printf ("\n u : %u\n f : %f\n", tmp.u, tmp.f);

    return 0;
}

Example Usage/Output

$ ./bin/unionuf

 u : 1073741824
 f : 2.000000

Let me know if you have any further questions. It's good to see that your study resulted in the correct floating point conversion. (also note the second comment regarding truncation/rounding)

Upvotes: 3
