Reputation: 42710
What's the difference between Double.MIN_NORMAL (introduced in Java 1.6) and Double.MIN_VALUE?
Upvotes: 53
Views: 12158
Reputation: 89643
IEEE-754 binary64 format:
s_eee_eeee_eeee_mmmm_mmmm_mmmm_mmmm_mmmm_mmmm_mmmm_mmmm_mmmm_mmmm_mmmm_mmmm_mmmm
(1 s; 3×4−1 = 11 e's; 64−3×4 = 52 m's), and its algorithm:
If e > 000_0000_0000 and e < 111_1111_1111: interpret as (-1)^s × 2^(e − balancer:1023) × (base:1 + m × 2^(−sub-one-pusher:52)). (These are the normal numbers.)
If e = 000_0000_0000: do the same (as the line above), except that base:1 becomes base:0 and e becomes e + 1. (These are the subnormal numbers, except for zero, which is neither subnormal nor normal.)
If e = 111_1111_1111 and m = 0000...0000: interpret as (-1)^s × infinity.
If e = 111_1111_1111 and m <> 0000...0000: interpret as NaN. (By the way, there are therefore 2 × (2^52 − 1) different bit representations for NaN; cf. quiet NaN and doubleToRawLongBits.)
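The four cases above can be sketched in Java as a hand-written decoder. This is only an illustration: the class name Binary64Decoder and the method decode are made up for this example, and it relies on Math.scalb being exact for the values involved.

```java
public class Binary64Decoder {
    /** Decodes raw binary64 bits by hand, following the four cases listed above. */
    public static double decode(long bits) {
        int s = (int) (bits >>> 63);              // 1 sign bit
        int e = (int) ((bits >>> 52) & 0x7FF);    // 11 exponent bits
        long m = bits & ((1L << 52) - 1);         // 52 mantissa bits
        double sign = (s == 0) ? 1.0 : -1.0;

        if (e == 0x7FF) {
            // All exponent bits set: infinity if m is zero, NaN otherwise.
            return (m == 0) ? sign * Double.POSITIVE_INFINITY : Double.NaN;
        }
        if (e == 0) {
            // Subnormal or zero: base 0, exponent e + 1, i.e. m * 2^(1 - 1023 - 52).
            return sign * Math.scalb((double) m, 1 - 1023 - 52);
        }
        // Normal: base 1, i.e. (2^52 + m) * 2^(e - 1023 - 52).
        return sign * Math.scalb((double) ((1L << 52) + m), e - 1023 - 52);
    }

    public static void main(String[] args) {
        // Compare the hand-written decoder with the JVM's own interpretation.
        double[] samples = {1.0, -2.5, Double.MIN_VALUE, Double.MIN_NORMAL, Double.MAX_VALUE, -0.0};
        for (double x : samples) {
            long bits = Double.doubleToRawLongBits(x);
            boolean same = Double.doubleToRawLongBits(decode(bits)) == bits;
            System.out.println(x + " decoded identically: " + same);
        }
    }
}
```

Comparing raw bits rather than using == keeps the check meaningful for -0.0, which compares equal to 0.0.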
Thus:
The smallest of its possible positive numbers is 0_000_0000_0000_0000_..._0001 (Double.MIN_VALUE, also .NET's Double.Epsilon), a subnormal number.
The smallest of its possible positive normal numbers is 0_000_0000_0001_0000_..._0000 (Double.MIN_NORMAL).
MIN_VALUE computation:
(-1)^(s:0) × 2^((e:0 + 1) − balancer:1023) × (base:0 + m:1 × 2^(−sub-one-pusher:52))
= 1 × 2^-1022 × 2^-52
= 2^-1074 (~4.94 × 10^-324)
and MIN_NORMAL computation:
(-1)^(s:0) × 2^(e:1 − balancer:1023) × (base:1 + m:0 × 2^(−sub-one-pusher:52))
= 1 × 2^-1022 × 1
= 2^-1022 (~2.225 × 10^-308)
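As a quick check of the two computations, here is a small Java sketch that rebuilds both constants from the powers of two derived above and compares them with the library constants and their bit patterns. The class and method names (MinConstants, minValueFromFormula, minNormalFromFormula) are made up for this example.

```java
public class MinConstants {
    /** 1 * 2^-1022 * 2^-52, mirroring the MIN_VALUE derivation. */
    public static double minValueFromFormula() {
        return Math.scalb(1.0, -1022) * Math.scalb(1.0, -52);
    }

    /** 1 * 2^-1022 * 1, mirroring the MIN_NORMAL derivation. */
    public static double minNormalFromFormula() {
        return Math.scalb(1.0, -1022) * 1.0;
    }

    public static void main(String[] args) {
        // Both comparisons are expected to print true.
        System.out.println(minValueFromFormula() == Double.MIN_VALUE);
        System.out.println(minNormalFromFormula() == Double.MIN_NORMAL);

        // The same values built directly from the bit patterns shown above.
        System.out.println(Double.longBitsToDouble(1L) == Double.MIN_VALUE);
        System.out.println(Double.longBitsToDouble(1L << 52) == Double.MIN_NORMAL);
    }
}
```

The two constants differ by a factor of exactly 2^52, which is the number of distinct mantissa patterns.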
Upvotes: 14
Reputation: 421030
The answer can be found in the IEEE specification of floating point representation:
For the single format, the difference between a normal number and a subnormal number is that the leading bit of the significand (the bit to left of the binary point) of a normal number is 1, whereas the leading bit of the significand of a subnormal number is 0. Single-format subnormal numbers were called single-format denormalized numbers in IEEE Standard 754.
In other words, Double.MIN_NORMAL is the smallest positive number you can represent, provided that you have a 1 in front of the binary point (the counterpart of the decimal point in a decimal system), while Double.MIN_VALUE is the smallest positive number you can represent without this constraint.
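One way to see the leading bit from Java is Double.toHexString, which prints the significand with its leading digit. The sketch below is only illustrative; the class LeadingBit and the helper isSubnormal are made up for this example.

```java
public class LeadingBit {
    /** True for nonzero values whose magnitude is below Double.MIN_NORMAL. */
    public static boolean isSubnormal(double x) {
        return x != 0.0 && Math.abs(x) < Double.MIN_NORMAL;
    }

    public static void main(String[] args) {
        // Normal number: the digit before the point is 1.
        System.out.println(Double.toHexString(Double.MIN_NORMAL)); // 0x1.0p-1022
        // Subnormal number: the digit before the point is 0.
        System.out.println(Double.toHexString(Double.MIN_VALUE));  // 0x0.0000000000001p-1022

        System.out.println(isSubnormal(Double.MIN_NORMAL)); // false
        System.out.println(isSubnormal(Double.MIN_VALUE));  // true
    }
}
```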
Upvotes: 36
Reputation: 3921
For simplicity, the explanation will consider just the positive numbers.
The spacing between two adjacent normalized floating point numbers x1 < x2 is at most epsilon * x1 (the normalized floating point numbers are not evenly spaced; they are roughly logarithmically spaced). Here epsilon is a constant called machine epsilon, which for double precision has the value 2^-52 (approximately 2.22e-16). That means that when a real number (i.e. the "mathematical" number) in the normal range is rounded to the nearest floating point number, the relative error is at most epsilon / 2 = 2^-53, which is called the unit roundoff.
The positive floating point numbers smaller than Double.MIN_NORMAL are called subnormals, and they evenly fill the gap between 0 and Double.MIN_NORMAL, spaced Double.MIN_VALUE apart. Because this spacing is fixed, the relative error bound for normal numbers no longer holds, so computations involving subnormals can lead to less accurate results. In exchange, subnormals allow a calculation to lose precision gradually when the result is small, instead of jumping straight to zero.
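The difference in spacing can be observed with Math.ulp, which returns the distance from a value to the next larger double of the same sign. This is a small sketch; the class Spacing and the helper relativeSpacing are made up for this example.

```java
public class Spacing {
    /** Gap to the next double, relative to the (positive) value itself. */
    public static double relativeSpacing(double x) {
        return Math.ulp(x) / x;
    }

    public static void main(String[] args) {
        // In the normal range the relative gap stays at or below 2^-52.
        System.out.println(relativeSpacing(1.0) == Math.scalb(1.0, -52));
        System.out.println(relativeSpacing(Double.MIN_NORMAL) == Math.scalb(1.0, -52));

        // In the subnormal range the absolute gap is always Double.MIN_VALUE ...
        System.out.println(Math.ulp(Double.MIN_NORMAL / 2) == Double.MIN_VALUE);
        System.out.println(Math.ulp(Double.MIN_VALUE) == Double.MIN_VALUE);

        // ... so the relative gap grows as values shrink, up to 1.0 at MIN_VALUE.
        System.out.println(relativeSpacing(Double.MIN_VALUE) == 1.0);
    }
}
```

Each comparison in main is expected to print true.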
Upvotes: 3