Eugene Trofimov

Reputation: 178

java Float.MAX_VALUE to Double

I have this code:

public class Main {
    public static void main(String[] args) {
        float a = Float.MAX_VALUE;
        double b = (double) a;
        b++;
        System.out.println(b == a);
    }
}

and it prints true. Could anyone explain why?

Upvotes: 2

Views: 950

Answers (1)

Eric Postpischil

Reputation: 224596

The precision of double is not capable of representing the difference between Float.MAX_VALUE and Float.MAX_VALUE+1, so a rounded result is returned. That rounded result is Float.MAX_VALUE.

Float.MAX_VALUE is 2^128−2^104. (Note that this is 2^127+2^126+2^125+…+2^104. That is, it is the sum of all powers of two from 2^127 to 2^104. In binary, it has 24 one bits, which is the number of bits in the significand[1] of a float. Mathematically, it equals 2^128−2^104.)
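The identity above can be checked with exact integer arithmetic. The following is only a minimal sketch: the class and method names (MaxValueCheck, exactMaxFloat, expected) are made up for illustration, and it relies on new BigDecimal(double) capturing the binary value without rounding.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

public class MaxValueCheck {
    // Exact integer value of Float.MAX_VALUE (hypothetical helper name).
    // Widening float to double is exact, and BigDecimal(double) does not round.
    static BigInteger exactMaxFloat() {
        return new BigDecimal(Float.MAX_VALUE).toBigIntegerExact();
    }

    // 2^128 - 2^104 computed with arbitrary-precision integers.
    static BigInteger expected() {
        return BigInteger.ONE.shiftLeft(128).subtract(BigInteger.ONE.shiftLeft(104));
    }

    public static void main(String[] args) {
        // Do the two exact values agree?
        System.out.println(exactMaxFloat().equals(expected()));
        // How many one bits does the value have in binary?
        System.out.println(exactMaxFloat().bitCount());
    }
}
```

If the reasoning above holds, this should print true followed by 24.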

When you add one to this, the mathematical result is of course 2^128−2^104+1. This is not representable in double, because the significand of a double is 53 bits, but from 2^127 to 1 (that is, 2^0) is 128 bits. You cannot fit bits for both 2^127 and 1 inside the significand of a double. When a result is not representable, the nearest representable number is returned.
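To see the bit-width problem concretely, one can build the exact sum as a BigInteger and ask how many bits it spans. This is an illustrative sketch, not part of the original code; BitsNeeded, exactSum and fitsInDouble are hypothetical names.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

public class BitsNeeded {
    // Exact mathematical value of Float.MAX_VALUE + 1, with no rounding.
    static BigInteger exactSum() {
        return new BigDecimal(Float.MAX_VALUE).toBigIntegerExact().add(BigInteger.ONE);
    }

    // True if v survives a round trip through double unchanged.
    // Assumes v is small enough that doubleValue() is finite.
    static boolean fitsInDouble(BigInteger v) {
        return new BigDecimal(v.doubleValue()).toBigIntegerExact().equals(v);
    }

    public static void main(String[] args) {
        BigInteger sum = exactSum();
        // Bits from the highest set bit down to the lowest set bit.
        System.out.println(sum.bitLength() - sum.getLowestSetBit());
        // Can a double hold this value exactly?
        System.out.println(fitsInDouble(sum));
    }
}
```

The span comes out as 128 bits, far more than the 53 a double significand holds, so the round trip fails.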

The representable number just below the mathematical result is 2^128−2^104, and the representable number just above the mathematical result is 2^128−2^104+2^75. (Note that from 2^127 to 2^75 is 53 bits, so 2^75 is the smallest power of 2 that fits in a 53-bit significand whose largest bit is scaled to 2^127. Thus, we calculated this next number above 2^128−2^104 by adding the smallest amount to it that fits in the significand.) So we have two candidates:

  • 2^128−2^104, which is 1 away from 2^128−2^104+1.
  • 2^128−2^104+2^75, which is 2^75−1 away from 2^128−2^104+1.

The former is closer, so it is chosen to be the computed result. Thus, in double, adding one to 2^128−2^104 produces 2^128−2^104.
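The spacing between the two candidates can be inspected directly with Math.ulp and Math.nextUp. The sketch below uses a hypothetical class name (Neighbors) and helper names chosen for illustration; 0x1p75 is Java's hexadecimal literal for 2^75 as a double.

```java
public class Neighbors {
    // Float.MAX_VALUE widened to double; this conversion is exact.
    static double start() {
        return Float.MAX_VALUE;
    }

    // Gap between start() and the next representable double above it.
    static double gapAbove() {
        return Math.nextUp(start()) - start();
    }

    // Does adding 1 leave the value unchanged after rounding?
    static boolean addingOneIsLost() {
        return start() + 1 == start();
    }

    public static void main(String[] args) {
        System.out.println(Math.ulp(start()) == 0x1p75); // spacing at this magnitude
        System.out.println(gapAbove() == 0x1p75);        // distance to upper neighbor
        System.out.println(addingOneIsLost());           // the behavior in the question
    }
}
```

All three lines should print true: the neighbors are 2^75 apart, so an increment of 1 rounds back down to the starting value.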

Footnote

[1] The representation of a binary floating-point number has three parts: a sign s that is +1 or −1, a significand f that is a fixed-point number with a fixed number of bits, and an exponent e, such that the number represented is s • f • 2^e. The significand can be thought of just as an integer with a certain number of bits, but it is often scaled by adjusting the exponent so that the significand of normal floating-point numbers is in [1, 2). For example, 132 could be thought of as the significand 100001_2 times 2^2 or as 1.00001_2 times 2^7.
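The 132 example in the footnote can be reproduced with Math.getExponent and Math.scalb. This is a rough sketch that only handles normal, finite, nonzero floats; SignificandDemo and its helper names are invented for illustration.

```java
public class SignificandDemo {
    // Unbiased exponent e such that x = f * 2^e with f in [1, 2).
    // Assumes x is a normal, finite, nonzero float.
    static int exponentOf(float x) {
        return Math.getExponent(x);
    }

    // Significand f scaled into [1, 2); the scaling by a power of two is exact.
    static float significandOf(float x) {
        return Math.scalb(x, -Math.getExponent(x));
    }

    public static void main(String[] args) {
        float x = 132f;
        System.out.println(exponentOf(x));    // exponent in the [1, 2) convention
        System.out.println(significandOf(x)); // significand as a decimal fraction
        // Integer view of the same significand: shift 5 fraction bits left.
        System.out.println(Integer.toBinaryString((int) (significandOf(x) * 32)));
    }
}
```

For 132 this gives exponent 7 and significand 1.03125, which is 1.00001 in binary; the last line prints the integer form 100001.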

Upvotes: 2
