Liliana Larson


Handling overflow when casting doubles to integers in C

Today, I noticed that when I cast a double that is greater than the maximum possible integer to an integer, I get -2147483648. Similarly, when I cast a double that is less than the minimum possible integer, I also get -2147483648.

Is this behavior defined for all platforms?
What is the best way to detect this under/overflow? Is putting if statements for min and max int before the cast the best solution?

Upvotes: 36

Views: 20969

Answers (10)

Eric Postpischil

Reputation: 222660

Here is C code to test and report whether a double can be converted to int without overflow and, if it can, return the resulting int. I copied it from another answer of mine. This code takes pains to use only behavior defined by the C standard across a variety of C implementations.

This approach uses the definition of floating-point formats in the C standard—as a signed base-b numeral multiplied by a power of b. Knowing the number of digits in the significand (provided by DBL_MANT_DIG) and the exponent limit (provided by DBL_MAX_EXP) allows us to prepare exact double values as end points.

I believe it will work in all conforming C implementations subject to the modest additional requirements stated in the initial comment.

/*  This code demonstrates safe conversion of double to int in which the
    input double is converted to int if and only if it is in the supported
    domain for such conversions (the open interval (INT_MIN-1, INT_MAX+1)).
    If the input is not in range, an error is indicated (by way of an
    auxiliary argument) and no conversion is performed, so all behavior is
    defined.

    There are a few requirements not fully covered by the C standard.  They
    should be uncontroversial and supported by all reasonable C implementations:

        Conversion of an int that is representable in double produces the
        exact value.

        The following operations are exact in floating-point:

            Dividing by the radix of the floating-point format, within its
            range.

            Multiplying by +1 or -1.

            Adding or subtracting two values whose sum or difference is
            representable.

        FLT_RADIX is representable in int.

        DBL_MIN_EXP is not greater than -DBL_MANT_DIG.  (The code can be
        modified to eliminate this requirement.)
*/


#include <float.h>
#include <errno.h>
#include <limits.h>
#include <stdio.h>


/*  These values will be initialized to the greatest double value not greater
    than INT_MAX+1 and the least double value not less than INT_MIN-1.
*/
static double UpperBound, LowerBound;


/*  Return the double of the same sign of x that has the greatest magnitude
    less than x+s, where s is -1 or +1 according to whether x is negative or
    positive.
*/
static double BiggestDouble(int x)
{
    /*  All references to "digits" in this routine refer to digits in base
        FLT_RADIX.  For example, in base 3, 77 would have four digits (2212).

        In this routine, "bigger" and "smaller" refer to magnitude.  (3 is
        greater than -4, but -4 is bigger than 3.)
    */

    //  Determine the sign.
    int s = 0 < x ? +1 : -1;

    //  Count how many digits x has.
    int digits = 0;
    for (int t = x; t; ++digits)
        t /= FLT_RADIX;

    /*  If the double type cannot represent finite numbers this big, return the
        biggest finite number it can hold, with the desired sign.
    */
    if (DBL_MAX_EXP < digits)
        return s*DBL_MAX;

    //  Determine whether x is exactly representable in double.
    if (DBL_MANT_DIG < digits)
    {
        /*  x is not representable, so we will return the next lower
            representable value by removing just as many low digits as
            necessary.  Note that x+s might be representable, but we want to
            return the biggest double less than it, which is also the biggest
            double less than x.
        */

        /*  Figure out how many digits we have to remove to leave at most
            DBL_MANT_DIG digits.
        */
        digits = digits - DBL_MANT_DIG;

        //  Calculate FLT_RADIX to the power of digits.
        int t = 1;
        while (digits--) t *= FLT_RADIX;

        return x / t * t;
    }
    else
    {
        /*  x is representable.  To return the biggest double smaller than
            x+s, we will fill the remaining digits with FLT_RADIX-1.
        */

        //  Figure out how many additional digits double can hold.
        digits = DBL_MANT_DIG - digits;

        /*  Put a 1 in the lowest available digit, then subtract from 1 to set
            each digit to FLT_RADIX-1.  (For example, 1 - .001 = .999.)
        */
        double t = 1;
        while (digits--) t /= FLT_RADIX;
        t = 1-t;

        //  Return the biggest double smaller than x+s.
        return x + s*t;
    }
}


/*  Set up supporting data for DoubleToInt.  This should be called once prior
    to any call to DoubleToInt.
*/
static void InitializeDoubleToInt(void)
{
    UpperBound = BiggestDouble(INT_MAX);
    LowerBound = BiggestDouble(INT_MIN);
}


/*  Perform the conversion.  If the conversion is possible, return the
    converted value and set *error to zero.  Otherwise, return zero and set
    *error to ERANGE.
*/
static int DoubleToInt(double x, int *error)
{
    if (LowerBound <= x && x <= UpperBound)
    {
        *error = 0;
        return x;
    }
    else
    {
        *error = ERANGE;
        return 0;
    }
}


#include <string.h>


static void Test(double x)
{
    int error, y;
    y = DoubleToInt(x, &error);
    printf("%.99g -> %d, %s.\n", x, y, error ? strerror(error) : "No error");
}


#include <math.h>


int main(void)
{
    InitializeDoubleToInt();
    printf("UpperBound = %.99g\n", UpperBound);
    printf("LowerBound = %.99g\n", LowerBound);

    Test(0);
    Test(0x1p31);
    Test(nexttoward(0x1p31, 0));
    Test(-0x1p31-1);
    Test(nexttoward(-0x1p31-1, 0));
}

Upvotes: 0

Dong Wang

Reputation: 19

We ran into the same issue. For example:

double d = 9223372036854775807L;
int i = (int)d;

On Linux and Windows, i = -2147483648, but on AIX 5.3, i = 2147483647.

If the double is outside the range of int:

  • Linux and Windows always return INT_MIN.
  • AIX returns INT_MAX if the double is positive and INT_MIN if the double is negative.
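
If the same result is wanted on every platform, one option is to clamp explicitly rather than rely on the cast. The following is only a sketch, not something either platform does: the name double_to_int_saturate and the choice to map NaN to 0 are assumptions made for illustration, and it assumes INT_MAX + 1 is a power of two that double represents exactly.

```c
#include <limits.h>

/* Hypothetical helper: clamps out-of-range input instead of relying on
   the platform's behavior for an out-of-range cast. */
int double_to_int_saturate(double x)
{
    /* INT_MAX + 1, computed without overflowing int. */
    const double upper = 2.0 * (INT_MAX / 2 + 1);

    if (x != x)                 /* NaN compares unequal to itself */
        return 0;               /* arbitrary choice for this sketch */
    if (x >= upper)
        return INT_MAX;
    if (x <= (double)INT_MIN)   /* INT_MIN is exact on two's complement */
        return INT_MIN;
    return (int)x;              /* in range: truncates toward zero */
}
```

With this helper, the value from the example above clamps to INT_MAX and a large negative value clamps to INT_MIN, regardless of what the hardware conversion instruction does.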

Upvotes: 1

chux

Reputation: 153457

What is the best way to detect this under/overflow?

Compare the truncated double to exact limits near INT_MIN and INT_MAX.

The trick is to exactly convert limits based on INT_MIN and INT_MAX into double values. A double may not exactly represent INT_MAX, as the number of bits in an int may exceed the floating-point type's precision.*1 In that case, the conversion of INT_MAX to double suffers from rounding. The number after INT_MAX is a power of 2 and is certainly representable as a double. 2.0*(INT_MAX/2 + 1) generates the whole number one greater than INT_MAX.

The same applies to INT_MIN on non-2s-complement machines.

INT_MAX is always a power-of-2 - 1.
INT_MIN is always:
-INT_MAX (not 2's complement) or
-INT_MAX-1 (2's complement)

int double_to_int(double x) {
  x = trunc(x);
  if (x >= 2.0*(INT_MAX/2 + 1)) Handle_Overflow();
  #if -INT_MAX == INT_MIN
  if (x <= 2.0*(INT_MIN/2 - 1)) Handle_Underflow();
  #else

  // Fixed 2022
  // if (x < INT_MIN) Handle_Underflow();
  if (x - INT_MIN < -1.0) Handle_Underflow();

  #endif
  return (int) x;
}

To detect NaN and avoid trunc():

#define DBL_INT_MAXP1 (2.0*(INT_MAX/2+1)) 
#define DBL_INT_MINM1 (2.0*(INT_MIN/2-1)) 

int double_to_int(double x) {
  if (x < DBL_INT_MAXP1) {
    #if -INT_MAX == INT_MIN
    if (x > DBL_INT_MINM1) {
      return (int) x;
    }
    #else
    if (ceil(x) >= INT_MIN) {
      return (int) x;
    }
    #endif 
    Handle_Underflow();
  } else if (x > 0) {
    Handle_Overflow();
  } else {
    Handle_NaN();
  }
}

[Edit 2022] Corner error corrected after 6 years.

double values in the range (INT_MIN - 1.0 ... INT_MIN) (non-inclusive end-points) convert well to int. Prior code failed those.


*1 This also applies to INT_MIN - 1 when int has more precision than double. Although this is rare, the issue readily applies to long long. Consider the difference between:

  if (x < LLONG_MIN - 1.0) Handle_Underflow(); // Bad
  if (x - LLONG_MIN < -1.0) Handle_Underflow();// Good

With 2's complement, some_int_type_MIN is a (negative) power-of-2 and exactly converts to a double. Thus x - LLONG_MIN is exact in the range of concern while LLONG_MIN - 1.0 may suffer precision loss in the subtraction.
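
Putting the footnote together with the upper-limit trick, a long long version might look like the sketch below. The name double_to_llong and the bool-plus-out-parameter interface are assumptions made for illustration, and two's complement is assumed so that LLONG_MIN converts to double exactly.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: stores the converted value in *out and returns true
   only when x lies in the open interval (LLONG_MIN - 1, LLONG_MAX + 1). */
bool double_to_llong(double x, long long *out)
{
    /* LLONG_MAX + 1 as an exact double (a power of two). */
    const double upper = 2.0 * (double)(LLONG_MAX / 2 + 1);

    if (!(x < upper))                       /* too large, or NaN */
        return false;
    if (!(x - (double)LLONG_MIN > -1.0))    /* too small */
        return false;
    *out = (long long)x;
    return true;
}
```

Writing each test as !(in-range condition) means NaN, which fails every comparison, is rejected along with the out-of-range values.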

Upvotes: 4

nwellnhof

Reputation: 33618

When casting floats to integers, overflow causes undefined behavior. From the C99 spec, section 6.3.1.4 Real floating and integer:

When a finite value of real floating type is converted to an integer type other than _Bool, the fractional part is discarded (i.e., the value is truncated toward zero). If the value of the integral part cannot be represented by the integer type, the behavior is undefined.

You have to check the range manually, but don't use code like:

// DON'T use code like this!
if (my_double > INT_MAX || my_double < INT_MIN)
    printf("Overflow!");

INT_MAX is an integer constant that may not have an exact floating-point representation. When comparing to a float, it may be rounded to the nearest higher or nearest lower representable floating point value (this is implementation-defined). With 64-bit integers, for example, INT_MAX is 2^63 - 1 which will typically be rounded to 2^63, so the check essentially becomes my_double > INT_MAX + 1. This won't detect an overflow if my_double equals 2^63.

For example with gcc 4.9.1 on Linux, the following program

#include <math.h>
#include <stdint.h>
#include <stdio.h>

int main() {
    double  d = pow(2, 63);
    int64_t i = INT64_MAX;
    printf("%f > %lld is %s\n", d, (long long)i, d > i ? "true" : "false");
    return 0;
}

prints

9223372036854775808.000000 > 9223372036854775807 is false

It's hard to get this right if you don't know the limits and internal representation of the integer and double types beforehand. But if you convert from double to int64_t, for example, you can use floating point constants that are exact doubles (assuming two's complement and IEEE doubles):

if (!(my_double >= -9223372036854775808.0   // -2^63
   && my_double <   9223372036854775808.0)  // 2^63
) {
    // Handle overflow.
}

The construct !(A && B) also handles NaNs correctly. A portable, safe, but slightly inaccurate version for ints is:

if (!(my_double > INT_MIN && my_double < INT_MAX)) {
    // Handle overflow.
}

This errs on the side of caution and will falsely reject values that equal INT_MIN or INT_MAX. But for most applications, this should be fine.
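
When the target is a fixed-width 32-bit integer, the exact-constant approach shown for int64_t carries over without the false rejections, because INT32_MIN - 1 and INT32_MAX + 1 both fit exactly in an IEEE double. A minimal sketch under that assumption; the name double_to_int32 is made up for illustration:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: returns true and stores the truncated value in *out
   when d is inside the open interval (INT32_MIN - 1, INT32_MAX + 1).
   Assumes IEEE 754 doubles, which represent both bounds exactly. */
bool double_to_int32(double d, int32_t *out)
{
    if (!(d > -2147483649.0        /* INT32_MIN - 1 */
       && d <  2147483648.0)) {    /* INT32_MAX + 1 */
        return false;              /* out of range, or NaN */
    }
    *out = (int32_t)d;             /* safe: truncates toward zero */
    return true;
}
```

Values such as 2147483647.5 and -2147483648.5 are accepted because they truncate to INT32_MAX and INT32_MIN respectively.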

Upvotes: 29

qrdl

Reputation: 34968

limits.h has constants for the maximum and minimum possible values of the integer data types, so you can check your double variable before casting, like this:

if (my_double > nextafter(INT_MAX, 0) || my_double < nextafter(INT_MIN, 0))
    printf("Overflow!");
else
    my_int = (int)my_double;

EDIT: nextafter() will solve the problem mentioned by nwellnhof

Upvotes: 14

fhe

Reputation: 6187

Another option is to use boost::numeric_cast which allows for arbitrary conversion between numerical types. It detects loss of range when a numeric type is converted, and throws an exception if the range cannot be preserved.

The website referenced above also provides a small example which should give a quick overview on how this template can be used.

Of course, this isn't plain C anymore ;-)

Upvotes: 2

Nils Pipenbrinck

Reputation: 86343

To answer your question: The behaviour when you cast out-of-range floats is undefined or implementation-specific.

Speaking from experience: I've worked on a MIPS64 system that didn't implement these kinds of casts at all. Instead of doing something deterministic, the CPU threw a CPU exception. The exception handler that ought to emulate the cast returned without doing anything to the result.

I ended up with random integers. Guess how long it took to trace a bug back to this cause. :-)

You'd better do the range check yourself if you aren't sure that the number stays within the valid range.

Upvotes: 13

JaredPar

Reputation: 754715

A portable way for C++ is to use the SafeInt class:

http://www.codeplex.com/SafeInt

The implementation allows normal addition, subtraction, etc. on a C++ number type, including casts. It throws an exception whenever an overflow scenario is detected.

SafeInt<int> s1 = INT_MAX;
SafeInt<int> s2 = 42;
SafeInt<int> s3 = s1 + s2;  // throws

I highly advise using this class anywhere overflow is an important scenario. It makes it very difficult to overflow silently. Where an overflow can be recovered from, simply catch the SafeIntException and recover as appropriate.

SafeInt now works on GCC as well as Visual Studio.

Upvotes: 4

Jeff Barger

Reputation: 1239

I can't tell you for certain whether it is defined for all platforms, but that is pretty much what has happened on every platform I've used. Except, in my experience, it rolls over. That is, if the value of the double is INT_MAX + 2, then the result of the cast ends up being INT_MIN + 2.

As for the best way to handle it, I'm really not sure. I've run up against the issue myself, and have yet to find an elegant way to deal with it. I'm sure someone who can help us both will respond.

Upvotes: -2

Sandeep Datta

Reputation: 29335

I am not sure about this, but I think it may be possible to "turn on" floating-point exceptions for under/overflow. Take a look at "Dealing with Floating-point Exceptions in MSVC7\8"; it might give you an alternative to if/else checks.

Upvotes: -1
