Adam Mills

Reputation: 31

numpy.absolute returns negative numbers

I am using np.absolute to try to find datasets that have meaningful data. The data includes negative numbers, and I am trying to use the mean to decide whether a dataset is meaningful. However, I don't want the negative values to cancel out the positive ones and give me a mean close to zero. Hence I use np.absolute and take the mean of that. My code wasn't flagging some datasets as meaningful that I knew were, so I did some troubleshooting and found that np.absolute() was returning negative values! How can this be? Here is a shortened version of the program. I tested it and it has the same problem as the full version.

import h5py
import numpy as np


with h5py.File('<filename>.h5', 'r') as hdf:
    # Read the whole 'timestream' dataset into memory as a NumPy array
    timestream = np.array(hdf.get('timestream'))
    # enumerate() supplies the row index directly, with no list lookup per row
    for index, x in enumerate(timestream):
        print("Index: %d" % index, x, np.absolute(x))

And the output (I'll only include one of the lines that shows the error):

Index: 32 [ 127 -128 -128 ... -128 -128 127] [ 127 -128 -128 ... -128 -128 127]

Notice that the last thing printed on that line should be np.absolute(x), yet there are negative values in the array.

Upvotes: 2

Views: 942

Answers (1)

Renan Elfo

Reputation: 25

numpy.absolute preserves the dtype of its input. As you're using 8-bit integers, the minimum value is -128 and the maximum is 127. This is a consequence of the two's complement representation that NumPy uses for signed integers. Since int8 has no positive equivalent of -128, negating it overflows and wraps back to -128, which is why numpy.absolute appears to ignore that value.
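A minimal sketch of that wraparound, assuming only NumPy itself. The sample values are made up to resemble the output in the question.

```python
import numpy as np

# Made-up int8 sample resembling the values printed in the question
x = np.array([127, -128, -1], dtype=np.int8)

info = np.iinfo(np.int8)
print(info.min, info.max)  # the int8 range has no room for +128

result = np.absolute(x)
print(result.dtype)  # the result keeps the input dtype
print(result)        # -128 stays -128; the other entries become non-negative
```

Only the single most negative value of the dtype is affected; every other negative entry is negated as expected.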

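NumPy 2 does not change this for arrays that already have a narrow dtype: np.absolute on an int8 array still returns int8, so -128 still comes back as -128. What NumPy 2 did change is the default integer type, which is now 64-bit on every platform (in NumPy 1.x it was 32-bit on Windows). A plain Python integer passed to np.absolute is therefore converted to int64 first, and the int32 minimum no longer overflows. The issue still persists for the int64 minimum, though, since int64 is the largest signed NumPy integer.

import numpy as np

print(np.__version__)

# np.iinfo(...).min is a plain Python int, so NumPy converts it to its default integer dtype
int32_min = np.iinfo(np.int32).min  # -2147483648

# Prints -2147483648/-2147483648 where the default integer is 32-bit (NumPy 1.x on Windows)
# and -2147483648/2147483648 where it is 64-bit (NumPy 2.x, and NumPy 1.x on 64-bit Linux or macOS)
print(f"Minimum value: {int32_min}; Absolute of minimum value: {np.abs(int32_min)}")

# Prints int32 with a 32-bit default integer and int64 with a 64-bit default integer
print(np.absolute(int32_min).dtype)

# Raises AssertionError with a 32-bit default integer but not with a 64-bit one
assert np.abs(int32_min) == -int32_min

int64_min = np.iinfo(np.int64).min  # -9223372036854775808

# Raises AssertionError in both NumPy 1.x and NumPy 2.x
assert np.abs(int64_min) == -int64_min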
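
For the use case in the question (the mean of the absolute values), one possible workaround is to widen the dtype before calling np.absolute. This is only a sketch: it assumes the data is an integer type narrower than int64, as the printed int8-range values suggest, and the helper name mean_abs is hypothetical.

```python
import numpy as np


def mean_abs(values):
    """Mean of absolute values for integer data narrower than int64.

    Widening to int64 first means the minimum of the narrower dtype
    (for example -128 for int8) has a representable positive counterpart.
    """
    wide = np.asarray(values).astype(np.int64)
    return np.absolute(wide).mean()


# Made-up int8 sample resembling the values printed in the question
x = np.array([127, -128, -128, 127], dtype=np.int8)

naive = np.absolute(x).mean()  # the wrapped -128 entries pull the mean down
fixed = mean_abs(x)            # each -128 contributes 128

print(naive, fixed)
```

With this sample the naive mean is close to zero even though every value has a large magnitude, which matches the symptom described in the question.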

Upvotes: 0
