Reputation: 21292
Is there a defined constant in the C++ standard for the largest sequential integer that can be stored in a float without approximation (and likewise for double and long double)?
Is this directly related to the number of mantissa bits/significand bits?
If so, would the max sequential integer be exactly (1 << mantissaBitCount) - 1?
Is there a defined constant in the C++ standard, for the mantissa bit count for float, double, and long double?
Upvotes: 2
Views: 225
Reputation: 21292
Here's a solution I wrote using recursive template metaprogramming.
It uses the pow-based formula from eerorika's answer, while avoiding potential problems caused by poor implementations of pow.
The recursive template metaprogramming is explained here.
The advantage of this approach is that the results are guaranteed to be constexpr.
#include <cstdint>
#include <limits>

// Recursive variable template (C++14): pow_<T, B, E> == B * pow_<T, B, E - 1>
template <class result_T, result_T B, std::uint64_t E>
constexpr result_T pow_ = B * pow_<result_T, B, E - 1>;

// "Base case" template (partial specialization),
// which ends the recursion so it doesn't loop forever: B^0 == 1
template <class result_T, result_T B>
constexpr result_T pow_<result_T, B, 0> = static_cast<result_T>(1);

// Friendly constants for max sequential integer values, for floating-point types
constexpr std::uint32_t FLOAT_MAX_SEQ_INT =
    pow_<std::uint32_t, std::numeric_limits<float>::radix, std::numeric_limits<float>::digits>;
constexpr std::uint64_t DOUBLE_MAX_SEQ_INT =
    pow_<std::uint64_t, std::numeric_limits<double>::radix, std::numeric_limits<double>::digits>;
// A long double template parameter requires C++20 (floating-point non-type parameters)
constexpr long double LONG_DOUBLE_MAX_SEQ_INT =
    pow_<long double, static_cast<long double>(std::numeric_limits<long double>::radix),
         std::numeric_limits<long double>::digits>;
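As a possible alternative to the recursive template, the same radix^digits product can be computed with an ordinary loop in a constexpr function (C++14 or later). This is only a sketch: the name max_seq_int is hypothetical and not part of the answer above, and the static_asserts assume IEEE-754 binary32 and binary64 formats for float and double.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: computes radix^digits for the floating-point type Float,
// storing the result in the (integer or floating-point) type Result.
// The caller must pick a Result type wide enough to hold the value.
template <class Result, class Float>
constexpr Result max_seq_int() {
    Result value = 1;
    for (int i = 0; i < std::numeric_limits<Float>::digits; ++i) {
        value *= std::numeric_limits<Float>::radix;
    }
    return value;
}

constexpr std::uint32_t float_max_seq_int  = max_seq_int<std::uint32_t, float>();
constexpr std::uint64_t double_max_seq_int = max_seq_int<std::uint64_t, double>();

// Assumption: IEEE-754 formats (24 and 53 significand bits). Other platforms differ.
static_assert(float_max_seq_int == 16777216u, "expected 2^24 for binary32");
static_assert(double_max_seq_int == 9007199254740992ull, "expected 2^53 for binary64");
```

Because the loop runs inside a constexpr function, the results can still be used wherever a constant expression is required, without relying on floating-point template parameters.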
Upvotes: 0
Reputation: 222536
The integer before the first unrepresentable integer in type T may be calculated as either of:
std::numeric_limits<T>::radix / std::numeric_limits<T>::epsilon(), or
std::scalbn(1, std::numeric_limits<T>::digits).
That integer is b^p, where b is the radix used for the floating-point format and p is the precision, the number of digits in the significand. To see why: if all the digits of the significand are at their maximum value (b−1), and the exponent is such that it just scales the significand to an integer, the value represented is b^p − 1 (such as 999 for three digits in base ten). The next integer, b^p, is also representable, with significand 1 and exponent p. The integer after that, b^p + 1, is not: in base b it would be a 1 followed by p−1 0s followed by a 1, which is p+1 digits and does not fit in a significand of p digits.
Regarding the second expression: the scalbn function, declared in <cmath>, multiplies its first operand by b to the power of the second operand. std::numeric_limits<T>::digits is the p described above, so that scalbn expression produces b^p.
Regarding the first expression: std::numeric_limits<T>::radix is b, and std::numeric_limits<T>::epsilon() is the position value of the lowest position in the significand for the floating-point number 1, so it is b^(1−p). Dividing b by b^(1−p) produces b^p.
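The reasoning above can be checked with a small sketch. The helper names limit_from_epsilon, limit_from_scalbn, and float_holds are hypothetical, and the static_asserts assume float is IEEE-754 binary32 (b = 2, p = 24), so that b^p is 16777216.

```cpp
#include <cmath>
#include <limits>

// First expression: radix / epsilon, i.e. b / b^(1-p) = b^p.
template <class T>
constexpr T limit_from_epsilon() {
    return std::numeric_limits<T>::radix / std::numeric_limits<T>::epsilon();
}

// Second expression: scalbn(1, p) = 1 * b^p.
// Left as a run-time function because std::scalbn is not constexpr in older standards.
template <class T>
T limit_from_scalbn() {
    return std::scalbn(T(1), std::numeric_limits<T>::digits);
}

// True when n survives a round trip through float unchanged,
// i.e. when n is exactly representable as a float.
constexpr bool float_holds(long long n) {
    return static_cast<long long>(static_cast<float>(n)) == n;
}

// Assumption: float is IEEE-754 binary32, so b^p = 2^24 = 16777216.
static_assert(limit_from_epsilon<float>() == 16777216.0f, "b^p for binary32");
static_assert(float_holds(16777215), "b^p - 1 is representable");
static_assert(float_holds(16777216), "b^p is representable");
static_assert(!float_holds(16777217), "b^p + 1 is not representable");
```

Note that larger integers such as 16777218 are representable again; b^p is only the end of the unbroken run of consecutive integers.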
Upvotes: 2
Reputation: 238321
Is there a defined constant in the C++ standard, for the mantissa bit count for float, double, and long double?
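Yes. They are in the std::numeric_limits template, as std::numeric_limits<T>::digits (with the base given by std::numeric_limits<T>::radix). There are also macros for these inherited from the C standard library: FLT_MANT_DIG, DBL_MANT_DIG, and LDBL_MANT_DIG in <cfloat>.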
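As a minimal sketch of where these constants live, the following compares the numeric_limits members against the corresponding <cfloat> macros. The variable names float_digits, double_digits, and long_double_digits are hypothetical and exist only for illustration.

```cpp
#include <cfloat>
#include <limits>

// Significand digit counts (in base radix) from the C++ template.
constexpr int float_digits       = std::numeric_limits<float>::digits;
constexpr int double_digits      = std::numeric_limits<double>::digits;
constexpr int long_double_digits = std::numeric_limits<long double>::digits;

// The same values are available as macros inherited from C.
static_assert(float_digits == FLT_MANT_DIG, "float digits match the C macro");
static_assert(double_digits == DBL_MANT_DIG, "double digits match the C macro");
static_assert(long_double_digits == LDBL_MANT_DIG, "long double digits match the C macro");
static_assert(std::numeric_limits<float>::radix == FLT_RADIX, "radix matches the C macro");
```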
Is [max sequential integer] directly related to the number of mantissa bits/significand bits?
Yes.
If so, would the max sequential integer be exactly (1 << mantissaBitCount) - 1?
Not quite. Remove the -1 and this would be true for representations that use a radix of 2 (which is common). Note that mantissaBitCount is assumed to be the number of bits in the true significand rather than the number of bits stored in memory, i.e. in the case of IEEE-754 it includes the implicit leading bit.
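For radix-2 formats, the shift formula without the -1 might be written as in the sketch below. The function name max_seq_int_radix2 is hypothetical; it shifts a 64-bit value because a plain int 1 would overflow for double, and it refuses types whose result would not fit in 64 bits (for example an 80-bit long double with 64 significand bits).

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: 1 << digits, valid only for radix-2 floating-point types.
template <class T>
constexpr std::uint64_t max_seq_int_radix2() {
    static_assert(std::numeric_limits<T>::radix == 2,
                  "the shift formula assumes a radix of 2");
    static_assert(std::numeric_limits<T>::digits < 64,
                  "the result must fit in 64 bits");
    // digits already counts the implicit leading bit of IEEE-754 formats.
    return std::uint64_t{1} << std::numeric_limits<T>::digits;
}

// Assumption: IEEE-754 binary32 and binary64.
static_assert(max_seq_int_radix2<float>() == 16777216ull, "2^24");
static_assert(max_seq_int_radix2<double>() == 9007199254740992ull, "2^53");
```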
Defined constant for max sequential integer that can be stored in a float?
There is no such constant, but it can be calculated using the provided constants:
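#include <cmath>
#include <limits>

using T = float; // as an example; works with double and long double too
using limits = std::numeric_limits<T>;
constexpr int digits = limits::digits;
constexpr int radix = limits::radix;
// const rather than constexpr: std::pow is not constexpr before C++26
// (some compilers, such as GCC, accept it in constant expressions as an extension)
const T max_conseq = std::pow(radix, T(digits));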
Upvotes: 1