Reputation: 33854
I am looking into a system developed to be used by people who don't understand floating point arithmetic. For this reason, the floating point comparison implementation is not exposed to the people using the system. Comparisons of floating point numbers currently work like this (and this cannot change, for legacy reasons):
// If either number is not finite, do the default comparison
if (!IsFinite(num1) || !IsFinite(num2)) {
    output = (num1 == num2);
} else {
    // Get the exponents of both numbers to determine the epsilon for comparison
    // (reads the high 32 bits of each double, assuming little-endian layout)
    tmp = (OSINT32*)&num1 + 1;
    exp1 = (((*tmp) >> 20) & 0x07ff) - 1023;
    tmp = (OSINT32*)&num2 + 1;
    exp2 = (((*tmp) >> 20) & 0x07ff) - 1023;
    // Check if the exponents are the same
    if (exp1 != exp2) {
        output = false;
    } else {
        // Calculate epsilon based on the magic number 47 (presumably determined experimentally)
        epsilon = pow(2.0, exp1 - 47);
        output = (fabs(num2 - num1) <= epsilon);
    }
}
The crux of it is, we calculate the epsilon based on the exponent of the number to stop users of the interface from making floating point comparison mistakes. A BIG NOTE: this is for people who are not software programmers, so when they do pow(sqrt(2), 2) == 2 they don't get a big surprise. Maybe this is not the best idea, but like I said, it cannot be changed.
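For concreteness, the same comparison can be reproduced in portable C++ roughly as below. This is an illustration only (it uses frexp/ldexp instead of the bit manipulation above, and only considers normal, finite doubles); the legacy code above is what actually runs.

#include <cmath>
#include <cstdio>

// Stand-in for the legacy comparison: same binary exponent required,
// then an exponent-scaled epsilon of 2^(exp - 47).
bool fuzzy_equal(double num1, double num2) {
    if (!std::isfinite(num1) || !std::isfinite(num2)) return num1 == num2;
    int e1, e2;
    std::frexp(num1, &e1);                    // frexp exponent = IEEE exponent + 1
    std::frexp(num2, &e2);
    if (e1 != e2) return false;               // different binary exponents: never equal
    double epsilon = std::ldexp(1.0, (e1 - 1) - 47);
    return std::fabs(num2 - num1) <= epsilon;
}

int main() {
    std::printf("%d\n", fuzzy_equal(std::pow(std::sqrt(2.0), 2.0), 2.0));  // prints 1
    std::printf("%d\n", fuzzy_equal(std::fmod(4.1, 1.0), 0.1));            // prints 1
}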
We are having trouble figuring out how to display numbers to the user. In the past the system simply displayed numbers to 15 significant digits, but this results in problems of the following type:
>> SHOW 4.1 MOD 1
>> 0.099999999999999996
>> SHOW (4.1 MOD 1) == 0.1
>> TRUE
The comparison calls this correct because of the generated epsilon, but printing the number this way is confusing for people: how is 0.099999999999999996 equal to 0.1? We need a way to display each number using the fewest significant digits such that the displayed value still compares TRUE against the original. So 0.099999999999999996 would be shown as 0.1, and 0.569999999992724327 as 0.569999999992725.
Is this possible?
Upvotes: 3
Views: 130
Reputation: 347
We need a way to display each number using the fewest significant digits such that the displayed value still compares TRUE against the original.
Can't you just do it the brute-force-ish way?
#include <iomanip>
#include <iostream>
#include <sstream>

const int MAX_PRECISION = 20;        // plenty of digits for a double

double num = 0.099999999999999996;   // e.g. the value from the SHOW example
for (int precision = 0; precision < MAX_PRECISION; ++precision) {
    std::stringstream str;
    double tmp = 0;
    str << std::fixed << std::setprecision(precision) << num;
    str >> tmp;
    if (num == tmp) {   // ideally this would call the system's fuzzy comparison instead of ==
        std::cout << std::fixed << std::setprecision(precision) << num;
        break;
    }
}
Upvotes: 0
Reputation: 11
It is not possible to avoid confusing users given the constraints you've specified. For one thing, 0.0999999999999996447 compares equal to 0.1, and 0.1000000000000003664 compares equal to 0.1, but 0.0999999999999996447 does not compare equal to 0.1000000000000003664. For another, 2.00000000000001421 compares equal to 2.0, but 1.999999999999999778 does not compare equal to 2.0 even though it's much closer to 2.0 than 2.00000000000001421 is.
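You can check those numbers against a rough reimplementation of the comparison (a frexp-based sketch of the exponent-plus-epsilon rule described in the question, not the actual production code):

#include <cmath>
#include <cstdio>

bool fuzzy_equal(double a, double b) {
    if (!std::isfinite(a) || !std::isfinite(b)) return a == b;
    int ea, eb;
    std::frexp(a, &ea);                        // frexp exponent = IEEE exponent + 1
    std::frexp(b, &eb);
    if (ea != eb) return false;
    return std::fabs(a - b) <= std::ldexp(1.0, (ea - 1) - 47);  // epsilon = 2^(exp - 47)
}

int main() {
    std::printf("%d\n", fuzzy_equal(0.0999999999999996447, 0.1));                   // 1
    std::printf("%d\n", fuzzy_equal(0.1000000000000003664, 0.1));                   // 1
    std::printf("%d\n", fuzzy_equal(0.0999999999999996447, 0.1000000000000003664)); // 0 (not transitive)
    std::printf("%d\n", fuzzy_equal(2.00000000000001421, 2.0));                     // 1
    std::printf("%d\n", fuzzy_equal(1.999999999999999778, 2.0));                    // 0 (different exponent)
}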
Enjoy.
Upvotes: -1
Reputation: 16907
You could calculate (num - pow(2.0, exp - 47)) and (num + pow(2.0, exp - 47)), convert both to strings, and search for the shortest decimal within that range.
The exact value of a double is mantissa * pow(2.0, exp - 52) with an integer mantissa (in [2^52, 2^53) for normal numbers), so if you add/subtract pow(2.0, exp - 47) you change the mantissa by 2^5 = 32, which should be exactly representable without rounding errors (unless in corner cases where the mantissa under/overflows, i.e. if it is within 2^5 of 2^52 or of 2^53; you might want to check for these*).
Then you have two strings; search for the first position where the digits differ and cut them off there. There are a lot of rounding cases, though, especially if you want not just any correct number in the range but the one closest to the input number (which might not be needed). For example, if you get "1.23" and "1.24", you might even want to output "1.235".
This also shows that your example is wrong: the epsilon for 0.569999999992724327 is (to maximal precision) 0.000000000000003552713678800500929355621337890625. The range is 0.569999999992720773889232077635824680328369140625 to 0.569999999992727879316589678637683391571044921875, and it would be cut off at 0.569999999992725 (or 0.569999999992723 if you prefer that rounding).
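In code, the range-and-cut idea could look roughly like this. It is only a sketch for positive values of about this magnitude, where both bounds print with the same number of integer digits; frexp/ldexp recover the exponent and epsilon, and shortest_in_range is just a name chosen here.

#include <cmath>
#include <cstdio>
#include <string>

std::string shortest_in_range(double num) {
    int e;
    std::frexp(num, &e);                          // frexp exponent = IEEE exponent + 1
    double eps = std::ldexp(1.0, (e - 1) - 47);   // epsilon = 2^(exp - 47)
    char lo[64], hi[64];
    std::snprintf(lo, sizeof lo, "%.25f", num - eps);
    std::snprintf(hi, sizeof hi, "%.25f", num + eps);
    std::string out;
    for (size_t i = 0; lo[i] != '\0' && hi[i] != '\0'; ++i) {
        out += hi[i];                             // keep hi's digit at the first difference;
        if (lo[i] != hi[i]) break;                // truncated there, out stays inside [num-eps, num+eps]
    }
    return out;
}

int main() {
    // Prints 0.569999999992727 for the example above, one of several short forms
    // inside the range; picking the one closest to num needs the extra rounding
    // logic mentioned earlier.
    std::printf("%s\n", shortest_in_range(0.569999999992724327).c_str());
}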
An easier-to-implement sledgehammer method would be to output the number at maximal precision, cut one digit off, convert it back to a double, and check whether it still compares equal. Then continue cutting until the comparison fails. (This could be improved with a binary search.)
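A sketch of that sledgehammer follows; fuzzy_equal is again a frexp-based stand-in for the real comparison, and instead of literally chopping characters it drops one digit of precision per step and lets printf round, so 0.099999999999999645 can shorten all the way to 0.1.

#include <cmath>
#include <cstdio>
#include <cstdlib>
#include <string>

bool fuzzy_equal(double a, double b) {
    if (!std::isfinite(a) || !std::isfinite(b)) return a == b;
    int ea, eb;
    std::frexp(a, &ea);
    std::frexp(b, &eb);
    if (ea != eb) return false;
    return std::fabs(a - b) <= std::ldexp(1.0, (ea - 1) - 47);  // epsilon = 2^(exp - 47)
}

std::string shortest_equal(double num) {
    char buf[64];
    std::snprintf(buf, sizeof buf, "%.17g", num);  // 17 significant digits round-trip a double
    std::string best = buf;
    for (int prec = 16; prec >= 1; --prec) {       // drop one digit of precision at a time
        std::snprintf(buf, sizeof buf, "%.*g", prec, num);
        if (!fuzzy_equal(std::strtod(buf, nullptr), num)) break;
        best = buf;                                // still compares equal: keep the shorter form
    }
    return best;
}

int main() {
    std::printf("%s\n", shortest_equal(std::fmod(4.1, 1.0)).c_str());   // prints 0.1
    std::printf("%s\n", shortest_equal(0.569999999992724327).c_str());  // prints 0.569999999992724
}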
* They should still be exactly representable, but your comparison method will behave very oddly. Consider num1 == 1 and num2 == 1 - pow(2.0, -53) = 0.99999999999999988897769753748434595763683319091796875. Their difference, 0.00000000000000011102230246251565404236316680908203125, is far below your epsilon 0.000000000000003552713678800500929355621337890625, but the comparison will say they differ, because they have different exponents.
Upvotes: 3
Reputation: 54859
Yes, it's possible.
#include <cmath>
#include <iomanip>
#include <iostream>

double a = std::fmod(4.1, 1);
std::cerr << std::setprecision(0)  << a << "\n";
std::cerr << std::setprecision(10) << a << "\n";
std::cerr << std::setprecision(20) << a << "\n";
produces:
0.1
0.1
0.099999999999999644729
I think you just need to determine what level of display precision corresponds to your epsilon value.
Upvotes: 1