Reputation: 91
This is my first time posting.
Here is my problem: I don't understand the following example.
Binary representation: 01000000011000000000000000000000
= +(1.11) base 2 x 2^(128-127)
<- all questions refer to this line.
• = +(1.11) base 2 x 2^1
• = +(11.1) base 2
• = +(1x2^1 + 1x2^0 + 1x2^-1) = (3.5) base 10
Questions:
Where does the 128-127 come from?
Why is it 1.11?
Upvotes: 0
Views: 5836
Reputation: 9572
I think that the rationale for having a bias (+127) in the exponent is that:
if you interpret the float as a 32bit integer, then you don't change the order.
That is
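float a, b;       /* assumed non-negative and not NaN */
int32_t ia, ib;   /* needs <stdint.h>, <string.h>, <assert.h> */
memcpy(&ia, &a, sizeof ia); memcpy(&ib, &b, sizeof ib);   /* reinterpret the bits; a plain (int) cast would convert the value instead */
assert((a < b) == (ia < ib));
You thus have to debias the exponent by subtracting 127...
EDIT: the inequality holds for non-negative floats, but not for NaN. Negative floats compare in reverse order, because the format is sign-magnitude.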
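To make the ordering claim concrete, here is a minimal sketch. It assumes float is a 32-bit IEEE 754 single on your platform, and the helper names float_bits and same_order are made up for illustration. Note that a plain (int) cast converts the value rather than reinterpreting the bits, so the sketch copies the bits with memcpy.

```c
#include <stdint.h>
#include <string.h>

/* Copy the bits of a float into a signed 32-bit integer (no value conversion).
   Assumption: float is a 32-bit IEEE 754 single. */
static int32_t float_bits(float f)
{
    int32_t i;
    memcpy(&i, &f, sizeof i);
    return i;
}

/* Returns 1 when comparing the raw bits agrees with comparing the floats. */
int same_order(float a, float b)
{
    return (a < b) == (float_bits(a) < float_bits(b));
}
```

With this, same_order(1.0f, 2.0f) is 1, while same_order(-2.0f, -1.0f) is 0: two negative floats order the opposite way when their bits are read as integers.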
Upvotes: 0
Reputation: 140236
In single precision floating point format, the exponent bias is the constant 127. The particular bit pattern you gave encodes a float with 128 (10000000) in the exponent field:
0 10000000 11000000000000000000000
s exponent fraction
First look at the sign (s) bit, it's 0. So it's a positive number.
Then you subtract the exponent bias from the exponent field, which is where 128 - 127 comes from. This gives 1.
Then we add up the place values of the fraction bits (11000000000000000000000), starting from the implied leading 1:
1 + 0.5 + 0.25 + 0 + 0 + 0....
Gives 1.75
Now we have +1(sign) * 2^1(exponent) * 1.75(fraction) = 2 * 1.75 = 3.5
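The same three steps (split the fields, remove the bias, add the implied 1) can be sketched in C. This is only an illustration for normal numbers: the function name decode_normal is hypothetical, and it assumes the exponent field is neither 0 nor 255 (those encode zero, subnormals, infinity and NaN).

```c
#include <stdint.h>

/* Decode a 32-bit pattern as a normal IEEE 754 single-precision value.
   Assumption: the exponent field is neither 0 nor 255. */
double decode_normal(uint32_t bits)
{
    uint32_t sign     = bits >> 31;            /* 1 bit */
    uint32_t exponent = (bits >> 23) & 0xFFu;  /* 8 bits, still biased */
    uint32_t fraction = bits & 0x7FFFFFu;      /* 23 bits */

    /* Implied leading 1 plus the fraction scaled by 2^-23. */
    double significand = 1.0 + (double)fraction / 8388608.0;

    /* Remove the bias, then build 2^e by repeated doubling or halving. */
    int e = (int)exponent - 127;
    double scale = 1.0;
    for (int i = 0; i < e; ++i) scale *= 2.0;
    for (int i = 0; i > e; --i) scale /= 2.0;

    return (sign ? -1.0 : 1.0) * significand * scale;
}
```

The pattern from the question is 0x40600000 in hex, and decode_normal(0x40600000u) returns 3.5.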
Another example:
00111110101010101010101010101011
Break it down:
0 01111101 01010101010101010101011
s exponent fraction
Sign is 0, so it's a positive number again.
125 (01111101) exponent, subtract exponent bias from it: 125 - 127 = -2
Decode the fraction 01010101010101010101011
1 + 0 + 0.25 + 0 + 0.0625 + 0 + 0.015625 + 0 + 0.00390625 + 0 + 0.0009765625 + 0 + 0.000244140625 + 0 + 0.00006103515625 + 0 + 0.0000152587890625 + 0 + 0.000003814697265625 + 0 + 9.5367431640625e-7 + 0 + 2.384185791015625e-7 + 1.1920928955078125e-7
This gives 1.3333333730697632 or so.
Now put it all together:
1(sign) * 2^-2(exponent) * 1.3333333730697632(fraction) = 0.25 * 1.3333333730697632 = 0.3333333432674408 =~ 0.3333333
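If you want to cross-check a hand decoding like this against what the machine does, one option is to copy the bit pattern into a float. A minimal sketch, assuming float is a 32-bit IEEE 754 single on your platform (bits_to_float is a made-up helper name):

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret a 32-bit pattern as a float by copying its bytes.
   Assumption: float is a 32-bit IEEE 754 single. */
float bits_to_float(uint32_t bits)
{
    float value;
    memcpy(&value, &bits, sizeof value);
    return value;
}
```

Here 0x3EAAAAAB is the second example's pattern written in hex, and 0x40600000 is the first. bits_to_float(0x40600000u) compares equal to 3.5f.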
Upvotes: 2
Reputation: 2780
This tutorial should give you a better understanding of floating points:
http://www.tfinley.net/notes/cps104/floating.html
The binary representation is broken down into 3 parts: 1 sign bit, 8 exponent bits, and 23 mantissa bits.
0|10000000|11000000000000000000000
sign|exponent| mantissa
The sign bit is zero, meaning it is a positive number. The exponent field (128) is, by definition, 127 greater than the actual exponent, so it resolves to 1 (i.e. 128 - 127). The mantissa is 1.11 in binary (the leading 1 is implied, again by definition). Therefore, we have
01000000011000000000000000000000
= +(1.11)base 2 x 2^(128-127)
= (2^0 + 2^-1 + 2^-2) x 2^1
= 2^1 + 2^0 + 2^-1
= 2 + 1 + 0.5
= 3.5
Upvotes: 0
Reputation: 6053
The very first thing you have to do is separate the fields (given IEEE 754 32-bit floating point encoding):
Sign bit: 0
Exponent bits: 10000000
Mantissa bits: 11000000000000000000000
The (128 - 127) is calculating the exponent by subtracting the exponent bias.
When converting from floating point to decimal, you subtract the exponent bias. When converting the other way, you add it. The exponent bias is calculated as:
2^(k−1) − 1 where k is the number of bits in the exponent field.
2^(8 - 1) - 1 = 127
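The bias formula is easy to check in code. A small sketch (exponent_bias is a hypothetical name, and it assumes k is small enough that the shift fits in an unsigned int):

```c
/* Exponent bias for an exponent field of k bits: 2^(k-1) - 1.
   Assumption: 1 <= k and k - 1 is less than the width of unsigned int. */
unsigned exponent_bias(unsigned k)
{
    return (1u << (k - 1)) - 1u;
}
```

For the 8-bit exponent field of single precision this gives 127, and for the 11-bit field of double precision it gives 1023.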
The mantissa is 1.11 in base 2 (binary). The mantissa is composed of a fraction and has an implied leading 1. Hence, with 11000... in the mantissa bits, you have an implied leading one to give you 1.11
Had the mantissa bits been 1011, your value of the mantissa would be 1.1011
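To see the implied leading 1 at work, here is a rough sketch that turns a string of fraction bits into its value. The function name significand and the choice of a '0'/'1' string as input are just assumptions for illustration:

```c
/* Value of 1.<fraction_bits> in binary.
   fraction_bits: string of '0'/'1' characters, most significant bit first. */
double significand(const char *fraction_bits)
{
    double value = 1.0;   /* implied leading 1 */
    double weight = 0.5;  /* first fraction bit is worth 2^-1 */
    for (const char *p = fraction_bits; *p != '\0'; ++p) {
        if (*p == '1')
            value += weight;
        weight /= 2.0;    /* each later bit is worth half as much */
    }
    return value;
}
```

significand("11") returns 1.75 (binary 1.11), and significand("1011") returns 1.6875 (binary 1.1011). Trailing zero bits add nothing, so the full 23-bit field 11000000000000000000000 also gives 1.75.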
Upvotes: 0