Roger V.

Reputation: 632

Convert pandas DataFrame to numpy array, inserting np.nan when necessary

I would like to convert a pandas DataFrame to a numpy array. I usually use df.to_numpy(), which lets me specify the data type (float or int in my case). However, some of the values are not numbers, and I would like to coerce them to np.nan. I cannot use a simple replacement rule, because the non-numeric values are represented by different symbols that I do not know in advance. (This is basically about cleaning a dataset.)

Upvotes: 0

Views: 452

Answers (1)

Stef

Reputation: 30589

You can first convert to numeric with pd.to_numeric and errors='coerce'. This sets anything that can't be parsed as a number to np.nan.

import pandas as pd
pd.to_numeric(pd.Series([1, 'a', '1.1.2', 1.1]), errors='coerce').to_numpy()
# array([1. , nan, nan, 1.1])
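pd.to_numeric works on one-dimensional input, so for a whole DataFrame one option is to apply it column by column. The following is a minimal sketch, not a definitive recipe: the frame, its column names, and the stray symbols ("?", "n/a", "--") are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical example data: column names and junk symbols are invented.
df = pd.DataFrame({
    "a": [1, "?", 3],
    "b": ["2.5", "n/a", "--"],
})

# Apply pd.to_numeric to each column; extra keyword arguments given to
# DataFrame.apply are forwarded to the function, so errors="coerce"
# turns every unparseable value into NaN.
arr = df.apply(pd.to_numeric, errors="coerce").to_numpy(dtype=float)
print(arr)
```

Note that the result has to be a float array: NaN has no integer representation, so requesting an int dtype from to_numpy() fails as soon as any value has been coerced.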

Upvotes: 1
