Reputation: 3753
I have a NumPy array of strings like the one below:
array([['18.0', '11.0', '5.0', ..., '19.0', '18.0', '20.0'],
['11.0', '14.0', '15.0', ..., '45.0', '26.0', '20.0'],
['1.0', '0.0', '1.0', ..., '3.0', '4.0', '17.0'],
...,
['nan', 'nan', 'nan', ..., 'nan', 'nan', 'nan'],
['nan', 'nan', 'nan', ..., 'nan', 'nan', 'nan'],
['nan', 'nan', 'nan', ..., 'nan', 'nan', 'nan']],
dtype='|S230')
But converting it to an int array turns the nan values into strange numbers:
df[:,4:].astype('float').astype('int')
array([[ 18, 11, 5,
..., 19, 18,
20],
[ 11, 14, 15,
..., 45, 26,
20],
[ 1, 0, 1,
..., 3, 4,
17],
...,
[-9223372036854775808, -9223372036854775808, -9223372036854775808,
..., -9223372036854775808, -9223372036854775808,
-9223372036854775808],
[-9223372036854775808, -9223372036854775808, -9223372036854775808,
..., -9223372036854775808, -9223372036854775808,
-9223372036854775808],
[-9223372036854775808, -9223372036854775808, -9223372036854775808,
..., -9223372036854775808, -9223372036854775808,
-9223372036854775808]])
How can I fix this?
Upvotes: 2
Views: 1775
Reputation: 11972
It all depends on what you expect the result to be. nan is a float value, so converting the string 'nan' to float is no problem. But there is no defined way to convert it to an int value.
I suggest you handle it differently: first choose which specific int you want all the nan values to become (for example 0), and only then convert the whole array to int:
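import numpy as np

a = np.array(['1', '2', '3', 'nan', 'nan'])
a[a == 'nan'] = 0  # replace every 'nan' string with 0 (stored as '0'); choose another number if you prefer
a = a.astype('int')  # works here because the remaining strings are plain integers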
Now a is equal to:
array([1, 2, 3, 0, 0])
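The strings in the question contain decimals ('18.0'), which a direct string-to-int conversion would reject, so a variant that goes through float first may be closer to what is needed. This is only a minimal sketch: the sample array raw and the fill value 0 are assumptions for illustration, not taken from the question's real data.

```python
import numpy as np

# Hypothetical sample resembling the question's data (decimal strings plus 'nan').
raw = np.array([['18.0', '11.0', 'nan'],
                ['nan', '3.0', '4.0']])

as_float = raw.astype(float)                         # 'nan' parses to np.nan
filled = np.where(np.isnan(as_float), 0, as_float)   # 0 is an arbitrary placeholder
as_int = filled.astype(np.int64)                     # safe now: no NaN left

print(as_int)
# [[18 11  0]
#  [ 0  3  4]]
```

If 0 is a legitimate value in the data, a sentinel such as -1 may be a better choice so that missing entries stay distinguishable.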
Upvotes: 1
Reputation: 96324
Converting a floating-point NaN to an integer type is undefined behavior, as far as I know. The number
-9223372036854775808
is the smallest int64, i.e. -2**63. Note that the same thing happens on my system when I coerce to int32:
>>> arr
array([['18.0', '11.0', '5.0', 'nan']],
dtype='<U4')
>>> arr.astype('float').astype(np.int32)
array([[ 18, 11, 5, -2147483648]], dtype=int32)
>>> -2**31
-2147483648
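To confirm where those values come from without relying on the platform-dependent cast, the integer limits can be read from np.iinfo, and the NaN entries can be located with np.isnan before casting. A rough sketch, with the small values array assumed for illustration:

```python
import numpy as np

# Smallest representable values for each integer type.
print(np.iinfo(np.int64).min)  # -9223372036854775808
print(np.iinfo(np.int32).min)  # -2147483648

# Hypothetical sample: parse to float, then find the NaN positions.
values = np.array(['18.0', '11.0', '5.0', 'nan']).astype(float)
nan_mask = np.isnan(values)

# Cast only the valid entries, so no NaN ever reaches the integer conversion.
valid_ints = values[~nan_mask].astype(np.int32)
print(valid_ints)  # [18 11  5]
```

Checking the mask first makes the handling of missing values explicit instead of depending on whatever the cast happens to produce.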
Upvotes: 2