Reputation: 1420
In C++, the conversion of an integer value of type I to a floating point type F will be exact, meaning static_cast<I>(static_cast<F>(i)) == i, if the range of I is a part of the range of integral values of F.
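The condition above can be checked at compile time. The following is a minimal sketch, not part of the original question: `always_exact` is a hypothetical helper name, and it assumes F is a binary (radix 2) floating-point type so that both `digits` values count bits.

```cpp
#include <cstdint>
#include <limits>
#include <type_traits>

// Hypothetical helper: true when every value of integer type I converts to
// floating-point type F without rounding. Assumes F has radix 2.
template <class F, class I>
constexpr bool always_exact()
{
    static_assert(std::is_integral<I>::value, "I must be an integer type");
    static_assert(std::is_floating_point<F>::value, "F must be a floating-point type");
    static_assert(std::numeric_limits<F>::radix == 2, "sketch assumes a binary F");
    // I needs no more significant bits than F stores, and the largest
    // magnitude of I (2^digits for a signed minimum) must be finite in F.
    return std::numeric_limits<I>::digits <= std::numeric_limits<F>::digits
        && std::numeric_limits<I>::digits < std::numeric_limits<F>::max_exponent;
}
```

With IEEE 754 types, `always_exact<double, std::int32_t>()` holds, while `always_exact<float, std::int32_t>()` does not, because float stores only 24 significant bits.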
Is it possible, and if so how, to calculate the loss of precision of static_cast<F>(i) (without using another floating point type with a wider range)?
As a start, I tried to write a function that returns whether a conversion is safe (safe meaning no loss of precision), but I must admit I am not sure it is correct.
template <class F, class I>
bool is_cast_safe(I value)
{
    return std::abs(value) < std::numeric_limits<F>::digits;
}
std::cout << is_cast_safe<float>(4) << std::endl; // true
std::cout << is_cast_safe<float>(0x1000001) << std::endl; // false
Thanks in advance.
Upvotes: 0
Views: 1507
Reputation: 222372
is_cast_safe can be implemented with:
static const F One = 1;
F ULP = std::scalbn(One, std::ilogb(value) - std::numeric_limits<F>::digits + 1);
I U = std::max(ULP, One);
return value % U == 0;  // safe exactly when value is a multiple of the ULP
This sets ULP to the value of the least digit position in the result of converting value to F. ilogb returns the position (as an exponent of the floating-point radix) for the highest digit position, and subtracting one less than the number of digits adjusts to the lowest digit position. Then scalbn gives us the value of that position, which is the ULP.
Then value can be represented exactly in F if and only if it is a multiple of the ULP. To test that, we convert the ULP to I (but substitute 1 if it is less than 1), and then take the remainder of value divided by the ULP (or 1).
Also, if one is concerned the conversion to F might overflow, code can be inserted to handle this as well.
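Putting the fragment and its explanation together, a complete function might look like the sketch below. Several details are assumptions rather than part of the answer: F is taken to have radix 2, every value of I is assumed to lie within the finite range of F (so no overflow handling is shown), zero is handled separately because ilogb of zero is a special value, and ilogb is applied to the converted value rather than to the integer itself.

```cpp
#include <algorithm>
#include <cmath>
#include <limits>
#include <type_traits>

// Sketch only: returns true when static_cast<F>(value) loses no precision.
// Assumes F has radix 2 and that value is within the finite range of F.
template <class F, class I>
bool is_cast_safe(I value)
{
    static_assert(std::is_integral<I>::value, "I must be an integer type");
    static_assert(std::numeric_limits<F>::radix == 2, "sketch assumes a binary F");

    if (value == 0)
        return true;  // ilogb(0) is a special value, so treat zero separately

    const F one = 1;
    const F x = static_cast<F>(value);  // the conversion under test
    // Weight of the lowest digit position that F keeps at this magnitude.
    const F ulp = std::scalbn(one, std::ilogb(x) - std::numeric_limits<F>::digits + 1);
    // Below 1 every integer is representable, so clamp the ULP to 1.
    const I u = static_cast<I>(std::max(ulp, one));
    return value % u == 0;  // exact if and only if value is a multiple of the ULP
}
```

Negative inputs need no special case here, since the remainder is zero for a multiple of the ULP regardless of sign.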
Calculating the actual amount of the change is trickier. The conversion to floating-point could round up or down, and the rule for choosing is implementation-defined, although round-to-nearest-ties-to-even is common. So the actual change cannot be calculated from the floating-point properties we are given in numeric_limits. It must involve performing the conversion and doing some work in floating-point. This definitely can be done, but it is a nuisance. I think an approach that should work is:
1. Assume value is non-negative. (Negative values can be handled similarly but are omitted for now for simplicity.)
2. Check that the conversion to F will not overflow. This in itself is tricky, as the behavior is undefined if the value is too large. Some similar considerations were addressed in this answer to a question about safely converting from floating-point to integer (in C).
3. Convert value to F, producing x. Divide x by the floating-point radix r, producing y. If y is not an integer (which can be tested using fmod or trunc), the conversion was exact.
4. Otherwise, convert y to I, producing z. This is safe because y is less than the original value, so it must fit in I.
5. The error due to the conversion, x - value, is then (z - value/r)*r - value%r, because x equals z*r and value equals (value/r)*r + value%r.
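The listed steps can be sketched as a single function. This is an illustration under stated assumptions, not the answerer's code: `conversion_error` is a hypothetical name, value is assumed non-negative, F is assumed to have radix 2, and the overflow check of the second step is assumed to have been done by the caller. The result is written as the converted value minus the original, which is why the remainder term is subtracted.

```cpp
#include <cmath>
#include <limits>
#include <type_traits>

// Sketch only: returns static_cast<F>(value) - value, computed in I.
// Assumes value >= 0, radix-2 F, and that the conversion cannot overflow.
template <class F, class I>
I conversion_error(I value)
{
    static_assert(std::is_integral<I>::value, "I must be an integer type");
    const I r = std::numeric_limits<F>::radix;

    const F x = static_cast<F>(value);  // perform the conversion
    const F y = x / static_cast<F>(r);  // dividing by the radix only shifts the exponent
    if (std::trunc(y) != y)
        return 0;  // x is not a multiple of r, so no low digits were dropped

    const I z = static_cast<I>(y);  // fits in I because y is about value / r
    // x == z * r and value == (value / r) * r + value % r, so subtract
    // piecewise to avoid forming x in I, where it might not fit.
    return (z - value / r) * r - value % r;
}
```

For a 32-bit int and IEEE 754 float, `conversion_error<float>(0x1000001)` is nonzero because 2^24 + 1 has no float representation; its sign depends on the rounding rule in effect.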
Upvotes: 1
Reputation: 29952
I loss = abs(static_cast<I>(static_cast<F>(i))-i) should do the job. The only exception is when the magnitude of i is so large that static_cast<F>(i) produces an F value outside the range of I.
(I assume here that an overload I abs(I) is available.)
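A minimal sketch of this round-trip idea follows. The names `round_trip_loss` and `round_trip_in_range` are hypothetical, and the range check is one possible way to guard the exceptional case. It assumes a radix-2 F in which 2 raised to numeric_limits<I>::digits is finite, and a two's complement I, so only the upper bound needs testing.

```cpp
#include <cmath>
#include <limits>
#include <type_traits>

// Sketch only: true when static_cast<F>(i) can be converted back to I.
// The lower bound of I is a power of two (or zero), so it converts exactly
// and only the upper bound can be exceeded by rounding.
template <class F, class I>
bool round_trip_in_range(I i)
{
    static_assert(std::is_integral<I>::value, "I must be an integer type");
    const F limit = std::scalbn(F(1), std::numeric_limits<I>::digits);  // max of I, plus 1
    return static_cast<F>(i) < limit;
}

// Sketch only: magnitude of the change caused by converting i to F.
// The caller must first confirm round_trip_in_range<F>(i), because the cast
// back to I has undefined behaviour otherwise.
template <class F, class I>
I round_trip_loss(I i)
{
    static_assert(std::is_integral<I>::value, "I must be an integer type");
    const I back = static_cast<I>(static_cast<F>(i));
    return back > i ? back - i : i - back;  // absolute difference without needing abs for I
}
```

Comparing the two values directly instead of calling abs keeps the sketch usable for unsigned types as well.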
Upvotes: 0