Reputation: 16660

Calculations with long double in clang – Compiler bug?

Is this a bug in clang?

This prints out the maximum double value:

long double a = DBL_MAX;
printf("%Lf\n", a);

It is:

179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321326889464182768467546703537516986049910576551282076245490090389328944075868508455133942304583236903222948165808559332123348274797826204144723168738177180919299881250404026184124858368.000000

This prints out the maximum long double value:

long double a = LDBL_MAX;
printf("%Lf\n", a);

It is:

/* … bigger, but not displayed here. For a good reason. ;-) */

This is quite clear.

But when I use an arithmetic expression that is computable at compile time as an initializer, I get a surprising result:

long double a = 1.L + DBL_MAX + 1.L;
printf("%Lf\n", a);

This still prints out DBL_MAX and not DBL_MAX + 2!?

It is the same if the computation is done at runtime:

long double b = 2.L;
long double a = DBL_MAX;
printf("%Lf\n", a+b);

Still DBL_MAX.

$ clang --version
Apple clang version 4.1 (tags/Apple/clang-421.11.66) (based on LLVM 3.1svn)
Target: x86_64-apple-darwin12.4.0
Thread model: posix

Upvotes: 3

Answers (4)

AnT stands with Russia

Reputation: 320719

An IEEE 754 double has a 53-bit mantissa (52 stored bits plus 1 implicit bit). That means a double can exactly represent every integer in the range -2^53...+2^53 (i.e. from -9007199254740992 to +9007199254740992). Beyond that, the type can no longer represent contiguous integers precisely. At first it can represent only even integer values, and any odd value is rounded to an adjacent even value according to the rounding mode in effect (round-to-nearest, ties-to-even, by default under IEEE 754). So it is perfectly expected that adding 1 to 9007199254740992 in double arithmetic might change nothing, due to rounding. From that limit on you have to add at least 2 to see the value change (until you reach the point where adding 2 stops having any effect either and you have to add at least 4, and so on).
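The 2^53 limit described above is easy to check. This is only a minimal sketch: the helper name `absorbed_in_double` is made up for illustration, and it assumes `double` is the IEEE 754 64-bit format with the default round-to-nearest mode.

```c
#include <stdbool.h>

/* Hypothetical helper: returns true if adding `step` to `x` leaves `x`
   unchanged in double precision. The volatile store forces the sum to be
   rounded to double, even on targets that evaluate intermediate results
   in a wider format. */
bool absorbed_in_double(double x, double step)
{
    volatile double sum = x + step;
    return sum == x;
}
```

Under those assumptions, `absorbed_in_double(9007199254740992.0, 1.0)` is true while `absorbed_in_double(9007199254740992.0, 2.0)` is false. One binade higher, at 2^54, even adding 2 is absorbed.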

The same logic applies to long double, if it is larger than double on your platform. On x86, long double might refer to the hardware 80-bit floating-point type with a 64-bit mantissa. That means that even with that type, the range in which contiguous integers are represented precisely is limited to a mere -2^64...+2^64.

The value of DBL_MAX is far, FAR, FAAAAR! outside that range. That means adding 1 to DBL_MAX will not have any effect on the value. Adding 2 will not have any effect either. Neither will 4, nor 1024, nor even 4294967296. You have to add something on the order of 2^960 (more precisely, at least nextafter(2^959), i.e. anything above half the spacing between neighbouring values) in order to make an impact on a DBL_MAX value stored in an 80-bit long double.
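Here is a rough sketch of that threshold. The helper names (`two_to_the`, `half_gap_above_dbl_max`, `absorbed_by_dbl_max`) are invented for this example. It assumes that long double has more significand bits than double (`LDBL_MANT_DIG > DBL_MANT_DIG`, e.g. 64 on x86) and that the default round-to-nearest, ties-to-even mode is active. Powers of two are built by repeated doubling so that nothing beyond `<float.h>` is needed.

```c
#include <float.h>
#include <stdbool.h>

/* Hypothetical helper: 2^n as a long double, built by repeated doubling
   (every intermediate value is a power of two, hence exact). */
long double two_to_the(int n)
{
    long double p = 1.0L;
    for (int i = 0; i < n; ++i)
        p *= 2.0L;
    return p;
}

/* Half the spacing between DBL_MAX and the next larger long double.
   DBL_MAX lies just below 2^DBL_MAX_EXP, so with LDBL_MANT_DIG significand
   bits the spacing is 2^(DBL_MAX_EXP - LDBL_MANT_DIG); half of that is one
   power of two lower (2^959 for the x86 80-bit format). */
long double half_gap_above_dbl_max(void)
{
    return two_to_the(DBL_MAX_EXP - LDBL_MANT_DIG - 1);
}

/* True if DBL_MAX + addend still compares equal to DBL_MAX in long double. */
bool absorbed_by_dbl_max(long double addend)
{
    volatile long double sum = (long double)DBL_MAX + addend;
    return sum == (long double)DBL_MAX;
}
```

With those assumptions, `absorbed_by_dbl_max(1.0L)`, `absorbed_by_dbl_max(4294967296.0L)` and even `absorbed_by_dbl_max(half_gap_above_dbl_max())` should all be true (the last one is an exact tie, which rounds back to DBL_MAX). The first addend that changes the result is `half_gap_above_dbl_max() * (1.0L + LDBL_EPSILON)`, which is the next representable value above the half gap, the same thing nextafter would give.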

Upvotes: 10

Andrew W

Reputation: 4618

A not quite technically correct answer that hopefully helps:

The number is represented by a sign, an exponent, and a fraction.

On this page, information about the C data types is given (https://en.wikipedia.org/wiki/C_data_types). The chart claims that long double is not guaranteed to be a "larger" data type than double; however, since C99 this is guaranteed if it exists on the target architecture (Annex F IEC 60559 floating-point arithmetic). Your results from DBL_MAX and LDBL_MAX show that on your implementation it does in fact use more bits.

So here's what's happening:

You have a number in the following format:

<sign><exponent><fraction>

in double that would be

<1 bit><11 bits><52 bits>

in long double, you have this 80-bit representation (https://en.wikipedia.org/wiki/Extended_precision)

<1 bit><15 bits><64 bits>

You can fit the double type into the long double type, so this causes no problems. However, notice that the decimal point is "floating" (hence the name): not all digits in the number are represented. The computer stores the most significant digits and then an exponent (it would be like me writing 1234567 E 234, for example; notice that I'm not writing out every digit of that number). When you try to add 1 to this, the digit in the ones place is not represented (because the exponent is so large), so the addition is lost to rounding.
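To make the <1 bit><11 bits><52 bits> layout concrete, here is a small sketch that pulls the three fields out of a double. The struct and function names are made up for illustration. It assumes that `double` is the 64-bit IEEE 754 format and that floating-point and integer values share the same byte order, which holds on common platforms but is not guaranteed by the C standard.

```c
#include <float.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical container for the three fields of an IEEE 754 double. */
struct double_fields {
    unsigned sign;      /* 1 bit */
    unsigned exponent;  /* 11 bits, stored with a bias of 1023 */
    uint64_t fraction;  /* 52 stored bits (the leading 1 is implicit) */
};

/* Splits a double into its sign, biased exponent and fraction fields. */
struct double_fields split_double(double x)
{
    uint64_t bits;
    struct double_fields f;

    memcpy(&bits, &x, sizeof bits);  /* copy out the raw bit pattern */
    f.sign     = (unsigned)(bits >> 63);
    f.exponent = (unsigned)((bits >> 52) & 0x7FF);
    f.fraction = bits & UINT64_C(0xFFFFFFFFFFFFF);
    return f;
}
```

For `DBL_MAX` this gives sign 0, the largest finite exponent field (2046, i.e. 2^1023 after removing the bias) and a fraction of all ones. The ones place is more than 900 binary digits below the lowest stored bit, which is why adding 1 gets lost.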

For more details, read up on floating point here (https://en.wikipedia.org/wiki/Double_precision_floating-point_format)

Upvotes: 3

Stephen Canon

Reputation: 106317

Not a bug. long double in clang/x86_64 has 64 bits of precision, and results are rounded to fit in that format.

This will all be clearer if we use hex instead of binary. DBL_MAX is:

0xfffffffffffff800000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000

The exact mathematical result of 1.L + DBL_MAX is therefore:

0xfffffffffffff800000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001

... but that is not representable as a long double, so the computed result is rounded to the closest representable long double, which is just DBL_MAX; adding 1 does not (and should not) change the value.

(It rounds down instead of up because the next larger representable number is

0xfffffffffffff801000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000

which is much farther away from the mathematically precise result than DBL_MAX is).
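If you want to look at these values on your own machine, C99's `%La` conversion prints a long double in hexadecimal floating-point notation. The sketch below uses hypothetical helper names and assumes long double has more significand bits than double (as on clang/x86_64). Otherwise the "next" value overflows to infinity.

```c
#include <float.h>
#include <stdio.h>

/* Hypothetical helper: the smallest long double above DBL_MAX, assuming
   LDBL_MANT_DIG > DBL_MANT_DIG. One unit in the last place at this
   magnitude is 2^(DBL_MAX_EXP - LDBL_MANT_DIG). */
long double next_above_dbl_max(void)
{
    long double ulp = 1.0L;
    for (int i = 0; i < DBL_MAX_EXP - LDBL_MANT_DIG; ++i)
        ulp *= 2.0L;                    /* powers of two, so exact */
    return (long double)DBL_MAX + ulp;  /* representable under the assumption */
}

/* Writes DBL_MAX and its upper neighbour into buf in hexadecimal
   floating-point notation, one per line; returns snprintf's result. */
int format_neighbours(char *buf, size_t size)
{
    return snprintf(buf, size, "%La\n%La\n",
                    (long double)DBL_MAX, next_above_dbl_max());
}
```

How the hex digits are grouped around the point is up to the C library, so the exact text differs between platforms, but the two lines should differ only in the last significand digits. `1.L + DBL_MAX` falls between these two neighbours, only 1 away from the lower one, so it rounds back to `DBL_MAX`.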

Upvotes: 9

Dietrich Epp

Reputation: 213768

This is expected behavior.

long double a = 1.L + DBL_MAX + 1.L;

The long double type is floating point: it has a finite amount of precision. The result of most operations is rounded to the nearest representable value.
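A quick way to convince yourself is to compare the result against DBL_MAX directly rather than eyeballing 300+ printed digits. This is just a sketch with made-up function names. The second function uses volatile operands so that the additions are not folded by the compiler.

```c
#include <float.h>
#include <stdbool.h>

/* The initializer from the question; the compiler may evaluate it
   at compile time. */
bool constant_expression_equals_dbl_max(void)
{
    long double a = 1.L + DBL_MAX + 1.L;
    return a == (long double)DBL_MAX;
}

/* The same sum with volatile operands, so each addition is performed
   (and rounded) at run time. */
bool runtime_sum_equals_dbl_max(void)
{
    volatile long double one = 1.L;
    volatile long double max = DBL_MAX;
    volatile long double a = one + max;  /* rounds back to DBL_MAX */
    a = a + one;                         /* and again */
    return a == (long double)DBL_MAX;
}
```

Both functions should return true: each addition is rounded on its own, and each time the nearest representable value is DBL_MAX itself.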

See What Every Programmer Should Know About Floating-Point Arithmetic.

Upvotes: 5
