Reputation: 632
I would like to convert a pandas DataFrame to a NumPy array. I usually use df.to_numpy(), which lets me specify the data type (float or int in my case). However, some of the values are not numbers, and I would like to coerce them to np.nan. I cannot use a simple replacement rule, because the non-numeric values appear as various symbols that I do not know in advance. (This is basically about cleaning a dataset.)
Upvotes: 0
Views: 452
Reputation: 30589
You can first convert to numeric with pd.to_numeric and errors='coerce'. This sets anything that can't be converted to np.nan.
import pandas as pd
pd.to_numeric(pd.Series([1, 'a', '1.1.2', 1.1]), errors='coerce').to_numpy()
# array([1. , nan, nan, 1.1])
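Since the question asks about a whole DataFrame rather than a single Series, here is a minimal sketch of the same idea applied column by column. The frame `df`, its column names, and its values are made up for illustration; the only calls used are `DataFrame.apply`, `pd.to_numeric`, and `DataFrame.to_numpy`.

```python
import numpy as np
import pandas as pd

# Hypothetical example frame; the column names and values are made up.
df = pd.DataFrame({
    "a": [1, "x", 3],
    "b": ["2.5", "?", "1.1.2"],
})

# Coerce each column separately: entries that cannot be parsed become NaN.
numeric_df = df.apply(pd.to_numeric, errors="coerce")

# Every column is numeric now, so the array can be requested as float.
arr = numeric_df.to_numpy(dtype=float)
print(arr)
```

Note that NaN is a floating-point value, so asking for an int array will likely fail whenever something was actually coerced; float is the safer target dtype here.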
Upvotes: 1