jchilton

Reputation: 39

Change 'nan' as a string to NaN in multiple columns

I'm currently working with a file where a number of columns/series contain both numeric and alpha characters, e.g. 35 km/h. Looking at .info(), these series are (obviously) objects because of the letters.

I replaced the letters with '' and was left with the numeric values, but noticed .info() still showed object. I found that each column contained the string 'nan' rather than a proper NaN, e.g.:

s = ['banana', 'apple', 'pear', 'cherry', 'apple', 'apple', 'banana']
t = [7,4,7,2,8,8,5]
x = [3,4,5,3,2, 'nan', 6,5]
y = [32,'nan', 'nan', 67,54,21,101]
z = ['nan', 654, 567, 876, 967, 'nan', 763]
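
For reference, a minimal sketch turning lists like these into a DataFrame (the column names come from the lists above; x is trimmed to seven entries here purely so every column has the same length, which pd.DataFrame requires):

import pandas as pd

# Example data mirroring the lists above ('nan' is a literal string,
# not a missing value); x is trimmed to seven entries so the columns
# are equal length.
df = pd.DataFrame({
    's': ['banana', 'apple', 'pear', 'cherry', 'apple', 'apple', 'banana'],
    't': [7, 4, 7, 2, 8, 8, 5],
    'x': [3, 4, 5, 3, 2, 'nan', 6],
    'y': [32, 'nan', 'nan', 67, 54, 21, 101],
    'z': ['nan', 654, 567, 876, 967, 'nan', 763],
})

print(df.dtypes)  # x, y and z report as object because of the 'nan' strings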

I'm looking to search through specific columns (x, y, z) and replace the 'nan' strings with proper NaN values.

I've managed to do this with individual columns:

df['x'].replace('nan', np.NaN)

But I'm wondering if there is a more efficient way to do this, e.g. by writing my own function.

I've also tried:

df[['x', 'y', 'z']].replace('nan', np.NaN, inplace=True)

But that didn't seem to work.

Upvotes: 2

Views: 563

Answers (1)

LeopardShark

Reputation: 4416

You can do this:

df[["x", "y", "z"]] = df[["x", "y", "z"]].replace("nan", np.NaN)

Upvotes: 1
