Reputation: 1801
If I'm working with a double, and I convert it to a float, how does this work exactly? Does the value get truncated so it fits into a float? Or does the value get rounded differently? Sorry if this sounds a bit remedial, but I'm trying to grasp the concept of float and double conversions.
Upvotes: 21
Views: 29447
Reputation: 81115
I would suggest that floating-point types are most usefully regarded as representing ranges of values. The reason that 0.1f displays as 0.1 rather than as 0.100000001490116119384765625 is that it really represents the range of numbers from 13421772.5/134217728 to 13421773.5/134217728 (i.e. from 0.0999999977648258209228515625 to 0.1000000052154064178466796875); it wouldn't make sense to add extra digits indicating the number is greater than 0.100 when it might be less, nor to use a string of nines indicating the number is less than 0.100 when it might be greater.
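The two decimal forms of 0.1f quoted above can be reproduced with a short sketch. The class and method names here are made up for illustration; the only library calls are `Float.toString` and the `BigDecimal(double)` constructor, which expands the stored binary value exactly.

```java
import java.math.BigDecimal;

// Hypothetical demo class; the name is not from the original answer.
class FloatDisplayDemo {
    // Shortest decimal string that still uniquely identifies this float.
    static String shortForm(float f) {
        return Float.toString(f);
    }

    // Exact decimal expansion of the binary value stored in the float.
    // Widening float to double is exact, and BigDecimal(double) is exact too.
    static String exactForm(float f) {
        return new BigDecimal((double) f).toString();
    }

    public static void main(String[] args) {
        System.out.println(shortForm(0.1f)); // 0.1
        System.out.println(exactForm(0.1f)); // 0.100000001490116119384765625
    }
}
```

The short form is what gets printed by default, which is why the extra digits are normally invisible.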
Casting a double to a float will select the float whose range of values includes the range of values represented by the double. Note that while this operation is non-reversible, the result of the operation will generally be arithmetically correct; the only time it would not be 100% arithmetically correct would be if one were casting to float a double whose range was precisely centered on the boundary between two floats. In that situation, the system would select the float on one side or the other of the double's range; if the double in fact represented a number on the wrong side of the range, the resulting conversion would be slightly inaccurate.
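As a rough check of the two claims above (the cast lands on the closest float, and it cannot generally be undone), here is a small sketch. The class and helper names are hypothetical; `Math.nextUp` and `Math.nextDown` are used to find the neighbouring floats.

```java
// Hypothetical demo class; names are illustrative only.
class NearestFloatDemo {
    // True if (float) d is at least as close to d as either neighbouring float.
    static boolean castPicksNearest(double d) {
        float f = (float) d;
        double err = Math.abs(d - f);
        double errUp = Math.abs(d - Math.nextUp(f));
        double errDown = Math.abs(d - Math.nextDown(f));
        return err <= errUp && err <= errDown;
    }

    // True if narrowing to float and widening back returns the original double.
    static boolean survivesRoundTrip(double d) {
        return (double) (float) d == d;
    }

    public static void main(String[] args) {
        System.out.println(castPicksNearest(Math.PI));  // true
        System.out.println(survivesRoundTrip(Math.PI)); // false: low-order bits are gone
        System.out.println(survivesRoundTrip(0.5));     // true: 0.5 fits in a float exactly
    }
}
```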
In practice, the tiny imprecision mentioned above is almost never relevant, because the "range of values" represented by a floating-point value is usually a little larger than indicated above. Performing a calculation (such as addition) on two numbers that have a certain amount of uncertainty will yield a result with more uncertainty, but the system won't keep track of how much uncertainty exists. Nonetheless, unless one performs dozens of operations on a float, or thousands of operations on a double, the amount of uncertainty will usually be small enough not to worry about.
It's important to note that casting a float to a double is actually a far more dangerous operation than casting a double to a float, even though Java allows the former implicitly without a warning but squawks at the latter. Casting a float to a double causes the system to select the double whose range is centered about the center of the float's range. This will almost always result in a value whose actual uncertainty is far greater than would be typical of double-precision numbers. For example, if one casts 0.1f to double, the resulting double will represent a number in the range 0.10000000149011611 to 0.10000000149011613, even though the number it's supposed to be representing (one tenth) is, relatively speaking, nowhere near that range.
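The 0.1f example is easy to reproduce. The sketch below uses a hypothetical `widen` helper simply to make the implicit conversion visible; nothing else is assumed beyond standard Java.

```java
// Hypothetical demo class; names are illustrative only.
class WideningDemo {
    // Implicit float-to-double widening: compiles without a cast or a warning.
    static double widen(float f) {
        return f;
    }

    public static void main(String[] args) {
        double widened = widen(0.1f);
        System.out.println(widened);              // 0.10000000149011612
        System.out.println(widened == 0.1);       // false
        System.out.println((float) 0.1 == 0.1f);  // true: narrowing 0.1 gives 0.1f
    }
}
```

The widened value is exactly what the float stored, but printed at double precision it no longer looks like one tenth.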
Upvotes: 9
Reputation: 272457
From the Java Language Specification, section 5.1.3:
A narrowing primitive conversion from double to float is governed by the IEEE 754 rounding rules (§4.2.4). This conversion can lose precision, but also lose range, resulting in a float zero from a nonzero double and a float infinity from a finite double. A double NaN is converted to a float NaN and a double infinity is converted to the same-signed float infinity.
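The range-loss cases in that paragraph can be seen directly. This is a minimal sketch; the class name and the `narrow` helper are made up, and the sample magnitudes (1e40 and 1e-50) are just values chosen to fall outside what a float can represent.

```java
// Hypothetical demo class; names and sample values are illustrative only.
class RangeLossDemo {
    // Explicit narrowing conversion from double to float.
    static float narrow(double d) {
        return (float) d;
    }

    public static void main(String[] args) {
        System.out.println(narrow(1e40));   // Infinity: finite double, too large for float
        System.out.println(narrow(1e-50));  // 0.0: nonzero double, too small for float
        System.out.println(narrow(Double.NaN));               // NaN
        System.out.println(narrow(Double.NEGATIVE_INFINITY)); // -Infinity
    }
}
```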
and section 4.2.4 says:
The Java programming language requires that floating-point arithmetic behave as if every floating-point operator rounded its floating-point result to the result precision. Inexact results must be rounded to the representable value nearest to the infinitely precise result; if the two nearest representable values are equally near, the one with its least significant bit zero is chosen. This is the IEEE 754 standard's default rounding mode known as round to nearest.
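To see the tie-breaking rule in a narrowing cast, one can build doubles that sit exactly halfway between two adjacent floats. The sketch below does this around 1.0f; the class, constant, and helper names are hypothetical, and the choice of 1.0f is only for convenience.

```java
// Hypothetical demo class; names are illustrative only.
class TieToEvenDemo {
    // Half the gap between 1.0f and the next float up (2^-24), held as a double.
    static final double HALF_ULP = Math.ulp(1.0f) / 2.0;

    static float narrow(double d) {
        return (float) d;
    }

    public static void main(String[] args) {
        float one = 1.0f;                       // last significand bit is 0
        float next = Math.nextUp(one);          // last significand bit is 1
        float nextNext = Math.nextUp(next);     // last significand bit is 0

        double tieLow = 1.0 + HALF_ULP;         // exactly halfway between one and next
        double tieHigh = 1.0 + 3 * HALF_ULP;    // exactly halfway between next and nextNext

        System.out.println(narrow(tieLow) == one);       // true: tie goes down to the even float
        System.out.println(narrow(tieHigh) == nextNext); // true: tie goes up to the even float
    }
}
```

So the value is neither truncated nor always rounded up: a tie goes to whichever neighbour has a zero in its least significant bit.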
Upvotes: 19