How to safely round-and-clamp from float64 to int64?

Question

This question is about Python/NumPy, but it may apply to other languages as well.

How can the following code be improved to safely clamp large float values to the maximum int64 value during conversion? (Ideally, it should still be efficient.)

import numpy as np

def int64_from_clipped_float64(x, dtype=np.int64):
  x = np.round(x)
  x = np.clip(x, np.iinfo(dtype).min, np.iinfo(dtype).max)
  # The problem is that np.iinfo(dtype).max is imprecisely approximated as a
  # float64, and the approximation leads to overflow in the conversion.
  return x.astype(dtype)

for x in [-3.6, 0.4, 1.7, 1e18, 1e25]:
  x = np.array(x, dtype=np.float64)
  print(f'x = {x:<10}  result = {int64_from_clipped_float64(x)}')

# x = -3.6        result = -4
# x = 0.4         result = 0
# x = 1.7         result = 2
# x = 1e+18       result = 1000000000000000000
# x = 1e+25       result = -9223372036854775808

orlp · Accepted Answer

The problem is that the largest np.int64 is 2⁶³ - 1, which is not exactly representable as a float64: with only a 53-bit significand, it rounds up to 2⁶³, which is out of range for int64. The same issue doesn't happen on the other end, because -2⁶³ is exactly representable.
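
A quick way to see this asymmetry is to convert both limits to float64 and back to Python integers. This is only an illustrative check; the variable names are made up for the example.

```python
import numpy as np

limits = np.iinfo(np.int64)

# 2**63 - 1 has no exact float64 representation, so it rounds up to 2**63.
max_as_float = np.float64(limits.max)
print(int(max_as_float))               # 9223372036854775808
print(int(max_as_float) - limits.max)  # 1

# -2**63 is a power of two, so it survives the round trip unchanged.
min_as_float = np.float64(limits.min)
print(int(min_as_float) == limits.min)  # True

# The largest float64 strictly below 2**63 is a full 1024 lower.
largest_below = np.nextafter(max_as_float, 0.0)
print(int(largest_below))              # 9223372036854774784
```

So any float64 that compares greater than or equal to np.float64(limits.max) is already too large for int64.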

So do half of the clipping in float space (to detect out-of-range values) and half in integer space (to correct them):

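def int64_from_clipped_float64(x, dtype=np.int64):
    assert x.dtype == np.float64

    # astype() truncates toward zero, so round first, as the question asks.
    x = np.round(x)

    limits = np.iinfo(dtype)
    # np.float64(limits.max) rounds up to 2**63, so >= catches every float
    # that would overflow; np.float64(limits.min) is exactly -2**63.
    too_small = x <= np.float64(limits.min)
    too_large = x >= np.float64(limits.max)
    # Out-of-range entries convert to garbage here; they are overwritten below.
    ix = x.astype(dtype)
    ix[too_small] = limits.min
    ix[too_large] = limits.max
    return ix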
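
For reference, here is a self-contained sketch that combines rounding with the same detect-in-float, correct-in-integer idea and runs it on the values from the question. The name round_and_clamp_to_int64 is hypothetical. The sketch also makes two choices that go beyond the answer: it casts only the in-range entries, so the out-of-range cast never happens, and it leaves NaN entries at 0, which is an arbitrary assumption.

```python
import numpy as np

def round_and_clamp_to_int64(x):
    """Round to nearest and clamp to the int64 range (hypothetical helper)."""
    # Accept scalars, lists, or arrays; always work on an array with ndim >= 1.
    x = np.round(np.atleast_1d(np.asarray(x, dtype=np.float64)))

    limits = np.iinfo(np.int64)
    lo = np.float64(limits.min)  # exactly -2**63
    hi = np.float64(limits.max)  # rounds up to 2**63

    # Detection happens in float space.
    too_small = x <= lo
    too_large = x >= hi
    in_range = (x > lo) & (x < hi)  # False for NaN, which therefore stays 0

    # Correction happens in integer space; only in-range values are cast.
    out = np.zeros(x.shape, dtype=np.int64)
    out[in_range] = x[in_range].astype(np.int64)
    out[too_small] = limits.min
    out[too_large] = limits.max
    return out

values = [-3.6, 0.4, 1.7, 1e18, 1e25, -1e25]
result = round_and_clamp_to_int64(values)
for v, r in zip(values, result):
    print(f'x = {v:<10}  result = {r}')
```

With these inputs, 1e25 comes out as np.iinfo(np.int64).max and -1e25 as np.iinfo(np.int64).min, while the in-range values are rounded as in the question. Note that np.round rounds halves to the nearest even value.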
