Yury Moskaltsov

Reputation: 57

How to turn unwanted string values into NaNs in pandas

I'm struggling with one task. I have imported an unclean dataframe, and some columns that are supposed to contain only float values also contain strings. This corrupts my data and prevents me from running a regression.

Say I have a dataframe X with an "investment_rounds" column that has mixed data types.

I want something like

np.where(X["investment_rounds"] == np.dtype.str, np.nan, X) 

Any ideas?

Upvotes: 0

Views: 514

Answers (1)

Chris

Reputation: 16147

The key here is the errors='coerce' parameter of pd.to_numeric.

Per the documentation, it replaces any value that cannot be converted to a number with NaN.

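import pandas as pd
# Sample column mixing numeric strings with junk values
df = pd.DataFrame({'investment_rounds': ['1.0', '2.0', 'bad', 'data', '3.0']})
# errors='coerce' turns anything that cannot be parsed as a number into NaN
df['investment_rounds'] = pd.to_numeric(df['investment_rounds'], errors='coerce')
print(df)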

Output

    investment_rounds
0   1.0
1   2.0
2   NaN
3   NaN
4   3.0
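
Since the question mentions several affected columns, the same idea can be applied to more than one column at once. This is only a rough sketch: the second column name 'funding_total', its values, and the numeric_cols list are made up for illustration, so substitute your own. It also shows one way to look at which original values were coerced before you overwrite anything.

```python
import pandas as pd

# Hypothetical frame; 'funding_total' and its values are invented for this example
X = pd.DataFrame({
    'investment_rounds': ['1.0', '2.0', 'bad', 'data', '3.0'],
    'funding_total': ['100', 'n/a', '250.5', '300', 'unknown'],
})
numeric_cols = ['investment_rounds', 'funding_total']  # assumed list of columns to clean

# Apply to_numeric column by column; unparseable entries become NaN
converted = X[numeric_cols].apply(pd.to_numeric, errors='coerce')

# True where a value was present originally but could not be parsed
coerced = converted.isna() & X[numeric_cols].notna()
bad_values = X.loc[coerced['investment_rounds'], 'investment_rounds'].tolist()

# Overwrite the original columns, then drop incomplete rows before the regression
X[numeric_cols] = converted
X_clean = X.dropna(subset=numeric_cols)
```

Checking bad_values first is worthwhile because errors='coerce' is silent: something like '1,000' would also become NaN, and you may prefer to fix such values rather than drop them.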

Upvotes: 1
