Reputation: 3260
How come the following test works fine on Windows, but fails on Linux:
import numpy as np
print(f"Numpy Version: {np.__version__}")
# version A
versionA = np.eye(4)
versionA[:3, 3] = 2**32 + 1
versionA = versionA.astype(np.uint32)
# version B
versionB = np.eye(4, dtype=np.uint32)
versionB[:3, 3] = np.asarray(2**32 + 1)
# # version C
# # (raises OverflowError)
# versionC = np.eye(4, dtype=np.uint32)
# versionC[:3, 3] = 2**32 + 1
np.testing.assert_array_equal(versionA, versionB)
I tested this on Windows and Linux with numpy versions 1.23.4, 1.21.5, and 1.24.0. On Windows, the assignment overflows to 1 in both versions and the assertion compares as equal. On Linux, on the other hand, versionB overflows to 1, but versionA results in assigning 0. As a result, I get the following failure:
AssertionError:
Arrays are not equal
Mismatched elements: 3 / 16 (18.8%)
Max absolute difference: 4294967295
Max relative difference: 4.2949673e+09
x: array([[1, 0, 0, 0],
[0, 1, 0, 0],
[0, 0, 1, 0],
[0, 0, 0, 1]], dtype=uint32)
y: array([[1, 0, 0, 1],
[0, 1, 0, 1],
[0, 0, 1, 1],
[0, 0, 0, 1]], dtype=uint32)
Can someone explain why numpy behaves this way?
Upvotes: 1
Views: 129
Reputation: 50358
NumPy is written in C, so it follows the C conventions here. The thing is, np.eye(4) is a float64 array and 2**32 + 1 is larger than the maximum value np.uint32 can hold. This means versionA.astype(np.uint32) performs an out-of-range float64 -> uint32 conversion. This is undefined behaviour in C, so the result can differ between platforms and compilers. NumPy tries to detect errors like this without impacting performance but, AFAIK, it does not catch all of them, and it looks like an open problem to me. From the point of view of the NumPy developers, the choice is between a completely deterministic, platform-independent behaviour at the cost of potentially expensive checks, or accepting the undefined behaviour and documenting it as expected. As of now, the problem is not well documented, but most errors are detected. Here is the related issue on the subject:
I strongly advise you to check the latest version of NumPy (v1.24.0), which should contain the rework that adds additional checks (i.e. the first linked issue above). If it still does not work with that version, please file an issue (with the related links above) so we can discuss this long-standing problem.
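In the meantime, a possible workaround is to avoid the out-of-range float64 -> uint32 cast altogether. The sketch below shows two options. The helper names checked_uint32 and wrapping_uint32 are hypothetical (they are not part of NumPy). The second helper assumes every value fits in int64, and relies on NumPy's integer-to-integer astype keeping the low 32 bits, which to my knowledge is what it does.

```python
import numpy as np


def checked_uint32(values):
    """Cast to uint32, raising if any value is outside the uint32 range.

    Hypothetical helper, not a NumPy function.
    """
    arr = np.asarray(values, dtype=np.float64)
    info = np.iinfo(np.uint32)
    # Reject NaN/inf and anything outside [0, 2**32 - 1] before casting.
    if not np.all(np.isfinite(arr)) or arr.min() < info.min or arr.max() > info.max:
        raise OverflowError("value out of range for uint32")
    return arr.astype(np.uint32)


def wrapping_uint32(values):
    """Cast to uint32 with modulo-2**32 wrap-around.

    Hypothetical helper. Assumes every value fits in int64, so the
    float64 -> int64 step stays in range; the int64 -> uint32 step
    then keeps only the low 32 bits.
    """
    arr = np.asarray(values, dtype=np.float64)
    return arr.astype(np.int64).astype(np.uint32)


a = np.eye(4)
a[:3, 3] = 2**32 + 1

wrapped = wrapping_uint32(a)
print(wrapped[:3, 3])  # [1 1 1]

try:
    checked_uint32(a)
except OverflowError as exc:
    print(f"rejected: {exc}")
```

Which one to use depends on whether wrap-around is actually wanted: the checked version turns the silent platform difference into an explicit error, while the wrapping version should reproduce the value 1 that versionB gives.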
Upvotes: 2