mchangun

Reputation: 10322

Numpy float64 vs Python float

I'm battling some floating point problems in the pandas read_csv function. In my investigation, I found this:

In [15]: a = 5.9975

In [16]: a
Out[16]: 5.9975

In [17]: np.float64(a)
Out[17]: 5.9974999999999996

Why are Python's builtin float and NumPy's np.float64 type giving different results? I thought they were both C++ doubles?

Upvotes: 67

Views: 150748

Answers (2)

cottontail

Reputation: 23041

NumPy's float64 dtype inherits from Python float, which is implemented as a C double internally. You can verify that as follows:

isinstance(np.float64(5.9975), float)   # True

So even if their string representation is different, the values they store are the same.
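As a rough check of that claim, the sketch below compares the two objects three ways: numeric equality, the hexadecimal form, and the raw bytes. The variable names are only illustrative, and it assumes NumPy is installed.

```python
import struct
import numpy as np

x = 5.9975             # plain Python float
y = np.float64(x)      # NumPy scalar built from the same value

# Equal as numbers, regardless of how each one is printed.
same_value = bool(x == y)

# Same hexadecimal form, i.e. the same mantissa and exponent.
same_hex = x.hex() == y.hex()

# Same 8 bytes: struct packs the Python float as a native-order C double.
same_bytes = struct.pack("=d", x) == y.tobytes()

print(same_value, same_hex, same_bytes)
```

If all three print as True, any difference you see on screen comes from formatting only.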

On the other hand, np.float32 implements C float (which has no analog in pure Python), so it does not inherit from Python float. Likewise, no NumPy int dtype (np.int32, np.int64, etc.) inherits from Python int, because in Python 3 int is unbounded:

isinstance(np.float32(5.9975), float)   # False
isinstance(np.int32(1), int)            # False

So why define np.float64 at all?

np.float64 defines most of the attributes and methods of np.ndarray. The following code shows that np.float64 lacks only 4 of the public attributes of an ndarray (the exact list and its order may vary with the NumPy version):

[m for m in set(dir(np.array([]))) - set(dir(np.float64())) if not m.startswith("_")]

# ['argpartition', 'ctypes', 'partition', 'dot']

So if you have a function that expects to use ndarray methods, you can pass it an np.float64, whereas a plain Python float will fail.

For example:

def my_cool_function(x):
    return x.sum()

my_cool_function(np.array([1.5, 2]))   # <--- OK
my_cool_function(np.float64(5.9975))   # <--- OK
my_cool_function(5.9975)               # <--- AttributeError
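If you also want plain Python floats to work, one possible approach (not part of the function above, just a sketch) is to coerce the argument with np.asarray first. The name my_tolerant_function is hypothetical.

```python
import numpy as np

def my_tolerant_function(x):
    # np.asarray wraps a plain float in a 0-dimensional array,
    # so the ndarray method .sum() is always available.
    return np.asarray(x).sum()

total_array = my_tolerant_function(np.array([1.5, 2]))   # array input
total_np = my_tolerant_function(np.float64(5.9975))      # NumPy scalar input
total_py = my_tolerant_function(5.9975)                  # plain float, no AttributeError
```

The trade-off is an extra conversion on every call, which matters little for scalars but keeps the function usable from code that never touches NumPy types.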

Upvotes: 1

Ignacio Vazquez-Abrams

Reputation: 798526

>>> numpy.float64(5.9975).hex()
'0x1.7fd70a3d70a3dp+2'
>>> (5.9975).hex()
'0x1.7fd70a3d70a3dp+2'

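They are the same number. What differs is only their printed representation: the Python native type uses a "sane" representation (the shortest decimal string that still round-trips to the same double), while the NumPy type here uses an accurate one that prints 17 significant digits.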
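The two printed forms from the question can be reproduced with plain Python, without NumPy. This is only a sketch of the formatting difference: repr gives the shortest round-tripping string, and the ".17g" format asks for 17 significant digits, which is always enough to identify a double uniquely. Newer NumPy releases may print np.float64 differently from the output shown in the question.

```python
a = 5.9975

shortest = repr(a)          # shortest string that round-trips
full = format(a, ".17g")    # 17 significant digits

print(shortest)   # 5.9975
print(full)       # 5.9974999999999996

# Both strings parse back to exactly the same double.
round_trips = float(shortest) == float(full) == a
print(round_trips)  # True
```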

Upvotes: 67
