Reputation: 177
I need to convert a NaN value to an integer in Python (NumPy, specifically), which unfortunately throws an error. For those unfamiliar with the issue, here is an MWE showcasing it:
import numpy as np
test_data = [[2.3, 4], [1.1, np.nan]]
test_array = np.array(test_data, dtype=[("col1", float), ("col2", int)])
Running this code produces the error ValueError: cannot convert float NaN to integer. There have been questions about this previously, most notably here and here, but they only offer workarounds that aren't useful in my situation. Here are some of the solutions they've given, plus a few I've thought of, along with the reasons they don't work for me:
So that's where I am. I need to have blank entries in a column of ints, and the only way I know of to mark a blank, NaN, works for floats but is rejected for ints. The various workarounds suggested in other answers to questions of this sort are unworkable in my specific case. So how can I get this to work, either by somehow making NaN convert to an int or by otherwise inserting a blank int?
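The root cause can be reproduced without any array at all. The following is a minimal sketch, not part of the original question, showing that plain Python raises the same error, since NaN exists only as a floating-point value and integer types have no representation for it. The variable name `message` is just for illustration.

```python
# NaN is a special floating-point value; integers have no equivalent,
# so the conversion is refused rather than silently producing a number.
try:
    int(float("nan"))
except ValueError as exc:
    message = str(exc)

print(message)
```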
Upvotes: 2
Views: 6760
Reputation: 813
A little late to the party, and I'm not sure if this is what you are looking for, but numpy.nan_to_num should be able to do that.
Using your example, this is what you could do:
test_data = [[2.3, 4], [1.1, np.nan]]
# replaces nan with the default value (0), then casts every element to int (so 2.3 is truncated to 2)
np.nan_to_num(x=test_data).astype('int')
array([[2, 4],
[1, 0]])
You could also specify a user-defined value for nan, as in the following example:
# converts nan to user-defined value (10)
np.nan_to_num(x=test_data, nan=10).astype('int')
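To keep the float column intact and only fill the int column, the same idea can be applied before building the structured array from the question. This is a rough sketch under a few assumptions: `SENTINEL = -1` is a hypothetical placeholder (whether any particular value is safe as a "blank" marker depends on your data), the `nan` keyword requires NumPy 1.17 or later, and each row is passed as a tuple because structured dtypes expect one tuple per record.

```python
import numpy as np

test_data = [[2.3, 4], [1.1, np.nan]]

# Hypothetical marker for "blank"; pick a value that cannot occur in real data.
SENTINEL = -1

# Replace NaN while everything is still float, so col1 keeps its decimals.
cleaned = np.nan_to_num(np.array(test_data, dtype=float), nan=SENTINEL)

# Build one tuple per record; only the second field is converted to int.
records = [(row[0], int(row[1])) for row in cleaned]
test_array = np.array(records, dtype=[("col1", float), ("col2", int)])

print(test_array["col1"])
print(test_array["col2"])
```

The blank entries can later be found with `test_array["col2"] == SENTINEL`.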
Upvotes: 1
Reputation: 584
One option is to create the array with dtype object:
test_array = np.array(test_data, dtype=object)
which will preserve the floats 2.3 and 1.1, keep nan as a float, and keep 4 as an integer:
print(test_array)
print([type(val) for row in test_array for val in row])
> [[2.3 4]
[1.1 nan]]
> [<class 'float'>, <class 'int'>, <class 'float'>, <class 'float'>]
If you want all the numbers cast as int, one thing you can do is cast what you can cast and leave the rest as-is:
array1 = np.array(test_data)
nan_indices = np.isnan(array1)
test_array = np.empty(array1.shape, dtype = object)
test_array[~nan_indices] = array1[~nan_indices].astype(int)
test_array[nan_indices] = np.nan
Then the printouts look like:
> [[2 4]
[1 nan]]
> [<class 'int'>, <class 'int'>, <class 'int'>, <class 'float'>]
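If this is needed in more than one place, the steps above could be wrapped in a small function. The sketch below is only an illustration: `to_int_keep_nan` is a made-up name, not a NumPy function. It also shows one way to locate the blank entries afterwards, since `np.isnan` does not accept object arrays directly.

```python
import numpy as np

def to_int_keep_nan(values):
    """Hypothetical helper: cast numbers to int, leave NaN entries as NaN."""
    as_float = np.asarray(values, dtype=float)
    nan_mask = np.isnan(as_float)
    # An object array can hold ints and the float NaN side by side.
    result = np.empty(as_float.shape, dtype=object)
    result[~nan_mask] = as_float[~nan_mask].astype(int)
    result[nan_mask] = np.nan
    return result

converted = to_int_keep_nan([[2.3, 4], [1.1, np.nan]])

# Only the NaN entries are floats, so a type check is enough to find the blanks.
blank_mask = np.array(
    [isinstance(v, float) for v in converted.ravel()]
).reshape(converted.shape)

print(converted)
print(blank_mask)
```

Keep in mind that object arrays give up most of NumPy's vectorized speed, so this trades performance for the ability to mix ints with NaN.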
Upvotes: 0