Reputation: 967
I understand that floating-point numbers are represented in memory using a sign, exponent and mantissa, each with a limited number of bits, and that this leads to rounding errors. Essentially, if I have a floating-point number, the limited number of bits means it gets mapped to the nearest representable value according to the rounding strategy.
Does this mean that two different floating-point numbers can get mapped to the same memory representation? If yes, how can I avoid it programmatically?
I came across this std::numeric_limits<T>::max_digits10
It says the minimum number of digits needed in a floating point number to survive a round trip from float to text to float.
Where does this round trip happen in a C++ program I write? As far as I understand, I have a float f1 which is stored in memory (probably with rounding error) and is read back. I can directly have another float variable f2 in the C++ program and compare it with the original f1. So my question is: when will I need std::numeric_limits::max_digits10 in this use case? Is there a use case that shows I need std::numeric_limits::max_digits10 to make sure I don't do things wrong?
Can anyone explain the above scenarios?
Upvotes: 4
Views: 532
Reputation: 3677
You seem to be confusing two sources of rounding (and precision loss) with floating point numbers.
The first one is due to the way floating point numbers are represented in memory, which uses binary numbers for the mantissa and exponent, as you pointed out. The classic example is:
const float a = 0.1f;
const float b = 0.2f;
const float c = a+b;
printf("%.8f + %.8f = %.8f\n",a,b,c);
which will print
0.10000000 + 0.20000000 = 0.30000001
There, the mathematically correct result is 0.3, but 0.3 is not representable with the binary representation. Instead you get the closest number which can be represented.
The other one, which is where max_digits10 comes into play, is the text representation of a floating point number, for example when you call printf or write to a file. When you do this using the %f format specifier, the number is printed in decimal.
When you print the number in decimal you decide how many digits get printed. In some cases the printout does not exactly match the actual number.
For example, consider
const float x = 10.0000095f;
const float y = 10.0000105f;
printf("x = %f ; y = %f\n", x,y);
this will print
x = 10.000010 ; y = 10.000010
On the other hand, increasing the precision of printf to 8 digits with %.8f will give you:
x = 10.00000954 ; y = 10.00001049
So if you wanted to save these two float values as text to a file using fprintf or ofstream with the default number of digits, you may have saved the same value twice where you originally had two different values for x and y.
max_digits10 is the answer to the question "how many decimal digits do I need to write in order to avoid this situation for all possible values?". In other words, if you write your float with max_digits10 significant digits (which happens to be 9 for floats) and load it back, you're guaranteed to get the same value you started with.
Note that the decimal value written may differ from the floating point number's actual value (due to the different representation). But it is still guaranteed that when you read the text of the decimal number into a float you will get the same value.
See the code run here: https://ideone.com/pRTMZM
Say you have your two floats from earlier,
const float x = 10.0000095f;
const float y = 10.0000105f;
and you want to save them to text (a typical use case would be saving to a human-readable format like XML or JSON, or even printing values to debug). In my example I'll just write to a string using stringstream.
Let's try first with the default precision :
stringstream def_prec;
def_prec << x <<" "<<y;
// What was written ?
cout <<def_prec.str()<<endl;
The default behaviour in this case was to round each of our numbers to 10 when writing the text. So if we now use that string to read back into two other floats, they will not contain the original values:
float x2, y2;
def_prec>>x2 >>y2;
// Check
printf("%.8f vs %.8f\n", x, x2);
printf("%.8f vs %.8f\n", y, y2);
and this will print
10 10
10.00000954 vs 10.00000000
10.00001049 vs 10.00000000
This round trip from float to text and back has erased a lot of digits, which might be significant. Obviously we need to save our values to text with more precision than this. The documentation guarantees that using max_digits10 will not lose data in the round trip. Let's give it a try using setprecision:
const int digits_max = numeric_limits<float>::max_digits10;
stringstream max_prec;
max_prec << setprecision(digits_max) << x <<" "<<y;
cout <<max_prec.str()<<endl;
This will now print
10.0000095 10.0000105
So our values were saved with more digits this time. Let's try reading back :
float x3, y3; // new names: x2 and y2 were already declared above
max_prec >> x3 >> y3;
printf("%.8f vs %.8f\n", x, x3);
printf("%.8f vs %.8f\n", y, y3);
Which prints
10.00000954 vs 10.00000954
10.00001049 vs 10.00001049
Aha ! We got our values back !
Finally, let's see what happens if we use one digit less than max_digits10.
stringstream some_prec;
some_prec << setprecision(digits_max-1) << x <<" "<<y;
cout <<some_prec.str()<<endl;
This is what gets saved as text:
10.00001 10.00001
And we read back :
10.00000954 vs 10.00000954
10.00001049 vs 10.00000954
So here, the precision was enough to keep the value of x but not the value of y, which was rounded down. This means we need to use max_digits10 if we want to make sure different floats can make the round trip to text and stay different.
Upvotes: 2
Reputation: 154315
Why do we need std::numeric_limits::max_digits10?
To know how many significant decimal digits are needed when converting a floating point value to text, so that every distinct value of that type produces distinct text.
Does this mean that 2 different floating points can get mapped to same memory representation? If yes, then how can i avoid it programmatically?
No, different floating point objects that differ in value will have different encodings.
Yes, different floating point constants in code that differ as text may map to the same memory representation. x1, x2 below certainly have the same encoding. A 32-bit float can only encode about 2^32 different values. Many different floating point constants map to the same float.
float x1 = 1.000000000000000001f;
float x2 = 1.000000000000000001000000000000000001f;
assert(x1 == x2);
Where does this round trip happens in a c++ program i write. Now my question is when will i need std::numeric_limits::max_digits10 in this use case? Is there any use case which explains that i need to use std::numeric_limits::max_digits10 to ensure that i don't do things wrong.
If code converts a floating point x to string s and then back to floating point y, then that is the round trip of concern.
For x == y to hold true for all x, s should contain at least max_digits10 significant decimal digits.
With fewer than max_digits10 significant decimal digits, x == y may still be true for some x, but not all.
With more than max_digits10 significant decimal digits, x == y is true for all x, yet s grows unnecessarily long.
Significant decimal digits
The significant digit count is not the number of digits to the right of the ., but the count starting from the most significant non-zero digit. All of the below, as code or text, have 9 significant decimal digits.
1.23456789
12345.6789
123456789.
123456789f
1.23456789e10
1.23456789e-10
-1.23456789
12345.0000
00012345.6789
Upvotes: 3
Reputation: 473946
Where does this round trip happens in a c++ program i write.
That depends on the code you write, but an obvious place would be... any floating-point literal you put in your code:
float f = 10.34529848505433;
Will f be exactly that number? No. It will be an approximation of that number because most implementations of float can't store that much precision. If you changed the literal to 10.34529848505432, odds are good f will have the same value.
This is not about round-tripping per se. The standard defines max_digits10 purely in terms of going from decimal to float:
Number of base 10 digits required to ensure that values which differ are always differentiated.
Upvotes: 1
Reputation: 114440
Forget about the exact representation for a minute, and pretend you have a two bit float. Bit 0 is 1/2, and bit 1 is 1/4. Let's say you want to transform this number into a string, such that when the string is parsed, it yields the original number.
Your possible numbers are 0, 1/4, 1/2, 3/4. Clearly you can represent all of them with two digits past the decimal point and get the same number back, since the representation is exact in this case. But can you get away with a single digit?
Assuming half always rounds up, the numbers map to 0, 0.3, 0.5, 0.8. The first and third numbers are exact while the second and fourth are not. So what happens when you try to parse them back?
0.3 - 0.25 < 0.5 - 0.3, and 0.8 - 0.75 < 1 - 0.8, so 0.3 is closest to 0.25 and 0.8 is closest to 0.75. In both cases the rounding works out. That means you only need one digit past the decimal point to capture the value of our contrived two-bit floats.
You can expand the number of bits from two to 53 (for a double), and add an exponent to alter the scale of the number, but the concept is exactly the same.
Upvotes: 3