Reputation: 31
I am using np.absolute to try to find datasets that have meaningful data. The data includes negative numbers, and I am trying to use the mean to decide whether a dataset is meaningful. However, I don't want the negative values to cancel out the positive ones and give me a mean close to zero, so I take np.absolute of the data and then the mean of that. My code wasn't flagging some datasets as meaningful that I knew were, so I did some troubleshooting and found that np.absolute() was returning negative values! How can this be? Here is a shortened version of the program. I tested it and it has the same problem as the full version.
import h5py
import numpy as np

with h5py.File('<filename>.h5', 'r') as hdf:
    timestream_data = hdf.get('timestream')
    timestream = np.array(timestream_data)
    for x in timestream:
        timestream_list = timestream.tolist()
        dataset_as_list = x.tolist()
        print("Index: %d" % timestream_list.index(dataset_as_list), x, np.absolute(x))
And the output (I'll only include one of the lines that shows the error): Index: 32 [ 127 -128 -128 ... -128 -128 127] [ 127 -128 -128 ... -128 -128 127]
Notice the last thing it should be printing on that line is np.absolute(x)... but there are negative values in the array...
Upvotes: 2
Views: 942
Reputation: 25
NumPy's absolute preserves the dtype of its input. Your data appears to be stored as 8-bit signed integers (int8), which have a minimum value of -128 and a maximum value of 127. That asymmetry comes from the two's complement representation NumPy uses for signed integers. Since int8 has no positive equivalent of -128, negating it overflows and wraps back around to -128, so numpy.absolute returns -128 unchanged.
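The wraparound can be reproduced without the HDF5 file. The sketch below uses a made-up int8 array (the name x and its values are assumptions for illustration, not the question's data) and shows one possible workaround: converting to a wider dtype before taking the absolute value.

```python
import numpy as np

# Hypothetical sample standing in for one int8 row from the question.
x = np.array([127, -128, -128, 0, -5], dtype=np.int8)

# The result stays int8, so |-128| = 128 does not fit and wraps back to -128.
wrapped = np.absolute(x)

# int16 can represent +128, so widening first gives the expected values.
widened = np.absolute(x.astype(np.int16))

print(wrapped.dtype, wrapped)
print(widened.dtype, widened)
```

Widening costs extra memory for the temporary copy, which may matter for very large datasets.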
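Upgrading to NumPy 2 does not remove this overflow: absolute still returns the input dtype, so an int8 array containing -128 behaves the same way. The version difference in the example below most likely comes from a different change. np.iinfo(...).min returns a plain Python int, which NumPy converts to its default integer dtype, and on Windows that default went from int32 in NumPy 1.x to int64 in NumPy 2.x. The int32 minimum therefore only overflows where the default integer is int32. The int64 minimum overflows in every version, since int64 is the largest signed NumPy integer.
import numpy as np
print(np.__version__)
int32_min = np.iinfo(np.int32).min  # -2147483648, a plain Python int
# Default int32 (NumPy 1.x on Windows): prints -2147483648/-2147483648
# Default int64 (NumPy 2.x): prints -2147483648/2147483648
print(f"Minimum value: {int32_min}; Absolute of minimum value: {np.abs(int32_min)}")
# Prints int32 where the default integer is int32, int64 where it is int64
print(np.absolute(int32_min).dtype)
# Raises AssertionError only when the result above is int32
assert np.abs(int32_min) == -int32_min
int64_min = np.iinfo(np.int64).min  # -9223372036854775808
# Fails in both NumPy 1.x and NumPy 2.x: no wider signed integer dtype exists
assert np.abs(int64_min) == -int64_min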
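For the original goal in the question (a mean of absolute values that does not cancel out), one option is to do the arithmetic in floating point, which avoids integer wraparound for every integer dtype at the cost of rounding for very large values. This is only a sketch: mean_abs is a hypothetical helper, and the small timestream array is made-up data standing in for the HDF5 dataset.

```python
import numpy as np

def mean_abs(row):
    """Mean of absolute values, computed in float64 to avoid integer wraparound."""
    return np.abs(np.asarray(row, dtype=np.float64)).mean()

# Hypothetical stand-in for the 2-D int8 'timestream' dataset.
timestream = np.array([[127, -128, -128, 127],
                       [0, 1, -1, 0]], dtype=np.int8)

# enumerate yields the row index directly, with no list lookup needed.
for index, row in enumerate(timestream):
    print(f"Index: {index}", mean_abs(row))
```

Using enumerate also avoids searching the full list for each row, which the tolist().index(...) approach does on every iteration.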
Upvotes: 0