Reputation: 6023
I wrote a socket server that reads data from some devices. After reading the data, a binary shift is applied to the bytes, which leaves me with an integer value, for instance 1108304047, that I want to convert to the IEEE 754 float 35.844417572021484. I found some solutions with struct.unpack, but they don't seem rational to me: first we convert the number to a string, then convert that to a float.
Is there any short way like Float.intBitsToFloat(1108304047) in Java?
The solution that I found with struct.unpack is quite long; it involves string conversion, substring fetching, zero filling, etc.:
import struct  # note: str.decode('hex') is Python 2 only; it was removed in Python 3

def convert_to_float(value):
    return struct.unpack("!f", hex(value)[2:].zfill(8).decode('hex'))[0]
Upvotes: 1
Views: 11472
Reputation: 492
This answer uses the ldexp and frexp decompositions, for positive numbers. If you are okay with up to 2^-16 relative error, you can express both sides of the transformation using just basic arithmetic and the ldexp/frexp decomposition.
Note that this is much slower than the struct hack, which can be more succinctly represented as struct.unpack('f', struct.pack('I', value)).
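For example, a self-contained version of that struct round trip might look like this (the function names here are just illustrative, and the explicit '>' formats pin down big-endian byte order so it behaves the same on every platform):

import struct

def int_bits_to_float(bits):
    # Reinterpret a 32-bit unsigned integer's bytes as an IEEE 754 float
    return struct.unpack('>f', struct.pack('>I', bits))[0]

def float_to_int_bits(x):
    # The inverse: reinterpret the float's bytes as an unsigned integer
    return struct.unpack('>I', struct.pack('>f', x))[0]

print(int_bits_to_float(1108304047))          # 35.844417572021484
print(float_to_int_bits(35.844417572021484))  # 1108304047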
Here is the decomposition method.
import math

def to_bits(x):
    man, exp = math.frexp(x)
    return int((2 * man + (exp + 125)) * 0x800000)

def from_bits(y):
    y -= 0x3e800000
    # the parentheses around (y & 0x7fffff) matter:
    # '+' binds tighter than '&' in Python
    return math.ldexp(
        float(0x800000 + (y & 0x7fffff)) / 0x1000000,
        (y - 0x800000) >> 23)
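A quick round trip with the question's example value (in CPython, where floats are doubles, this particular value should convert exactly):

>>> to_bits(35.844417572021484)
1108304047
>>> from_bits(1108304047)
35.844417572021484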
While the from_bits function looks scarier, it is actually nothing more than the inverse of to_bits, modified so that we only perform a single floating point division (not because of speed considerations, just because it should be the sort of mindset we have when we do need to work with machine representations of floats). Therefore, I'll focus on explaining the forward transformation.
Recall that a (positive) IEEE 754 floating point number is represented as a tuple of a biased exponent and its mantissa. The lower 23 bits m are the mantissa, and the upper 8 bits e (minus the most significant bit, which we assume to always be zero) represent the exponent, so that

x = (1 + m / 2^23) * 2^(e - 127)

Let man' = m / 2^23 and exp' = e - 127; then 0 <= man' < 1 and exp' is an integer. Therefore

(man' + exp' + 127) * 2^23

gives the IEEE 754 representation.
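For concreteness, here is a small sketch (my own illustration, not part of the answer's code) that pulls m and e out of the question's bit pattern and plugs them into the formula above:

bits = 1108304047
m = bits & 0x7fffff        # lower 23 bits: mantissa, here 1007791
e = (bits >> 23) & 0xff    # upper 8 bits: biased exponent, here 132
x = (1 + m / float(2 ** 23)) * 2 ** (e - 127)
print(x)                   # 35.844417572021484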
On the other hand, the frexp decomposition computes a pair man, exp = frexp(x) such that man * 2^exp = x and 0.5 <= man < 1. A moment of thought will show that man' = 2 * man - 1 and exp' = exp - 1, therefore its IEEE machine representation is

(man' + exp' + 127) * 0x800000 = (2 * man + exp + 125) * 0x800000
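You can verify this identity directly against frexp, taking the struct round trip as ground truth (a sanity check of mine, not part of the original answer):

import math, struct

x = 35.844417572021484
man, exp = math.frexp(x)    # man = 0.560069..., exp = 6
bits = struct.unpack('>I', struct.pack('>f', x))[0]
assert int((2 * man + exp + 125) * 0x800000) == bits == 1108304047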
How much roundoff error do we expect? Well, let's assume that frexp introduces no error within its decomposition. This is unfortunately impossible, but we can relax this down the line.
The main feature is the computation 2 * man + (exp + 125). Why? 0x800000 is a perfect power of two, and therefore a floating point multiplication by a power of two will nearly always be lossless (unless we overflow), since the FPU is just adding 23 << 23 to the machine representation (without touching the mantissa, which is where error arises). Similarly, the multiplication 2 * man is also lossless (akin to just adding 1 << 23 to the machine representation). Furthermore, exp and 125 are integers, so (exp + 125) is also computed to exact precision.
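You can observe this exponent-only shift directly on the 32-bit representation. Here is an illustrative check of mine (using struct to read the bits rather than peeking at the FPU; f32_bits is my own helper):

import struct

def f32_bits(x):
    # 32-bit IEEE 754 bit pattern of x
    return struct.unpack('>I', struct.pack('>f', x))[0]

x = 1.5
# multiplying by 2**23 only bumps the exponent field by 23
assert f32_bits(x * 0x800000) - f32_bits(x) == 23 << 23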
Therefore, we are left to analyze the error behavior of m + e, where 1 <= m < 2 and |e| < 127. In the worst case, m has all 23 bits filled (corresponding to m = 2 - 2^-22) and e = +/-127. Here, this addition will unfortunately clobber the 8 least significant bits of m, since it has to renormalize m (which is at the exponential range of 2^0) to the exponential range of 2^8, which means losing 8 bits. However, since a mantissa has 24 significant bits, we effectively lose 2^-(24 - 8) amount of precision, which upper-bounds the error.
In a similar line of reasoning for from_bits, you can show that float(0x800000 + (y & 0x7fffff)) is basically computing the operation (1.0f + m), where m may have up to 23 bits of precision and is strictly less than 1. Therefore, we're adding a precise number at the scale of 2^0 to another number at the scale of 2^-1, so we expect a loss of one bit. This then suggests that we would incur up to 2^-22 relative error in the backwards transformation.
Both of these transformations incur very little roundoff, and if you throw in an extra multiplication into to_bits, you can also bring its error down to just 2^-22.
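As a rough empirical check (my own snippet; note that CPython floats are doubles, so the roundoff observed in practice is even smaller than the single-precision bound derived above):

import math

x = math.pi
roundtrip = from_bits(to_bits(x))
rel_err = abs(roundtrip - x) / x
assert rel_err < 2 ** -16   # within the bound derived above
print(rel_err)              # roughly 5e-08, well under 2**-16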
Do not do this in production.
This is just a clever float-hack that seems fun. It's not meant to be anything more than that.
Upvotes: 1
Reputation: 3876
As you can see, under the hood Java uses a C union to do the trick:
/*
 * Find the float corresponding to a given bit pattern
 */
JNIEXPORT jfloat JNICALL
Java_java_lang_Float_intBitsToFloat(JNIEnv *env, jclass unused, jint v)
{
    union {
        int i;
        float f;
    } u;
    u.i = (long)v;
    return (jfloat)u.f;
}
In Python it is not possible to do it this way; instead you need to use the struct library:
This module performs conversions between Python values and C structs represented as Python strings
First the number is packed as a big-endian long
packed_v = struct.pack('>l', b)
and then those bytes are unpacked as a float
f = struct.unpack('>f', packed_v)[0]
That's similar to what Java does.
import struct

def intBitsToFloat(b):
    s = struct.pack('>l', b)
    return struct.unpack('>f', s)[0]
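With the value from the question this should give:

>>> intBitsToFloat(1108304047)
35.844417572021484

One caveat: '>l' packs a signed 32-bit value, so bit patterns of 2**31 or above (i.e. negative floats, with the sign bit set) would need '>I' on the pack side instead.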
Please correct me if I'm wrong.
Upvotes: 3