Reputation: 129
I need to clean my data frame and remove all columns without numeric data. I have columns which are classified as "object" and some marked as int/float but containing mostly NaNs. I would like to keep only columns filled with numbers. Is there a way to do it?
Upvotes: 3
Views: 63
Reputation: 394031
Use select_dtypes and pass np.number to filter numeric types only:
In [69]:
df = pd.DataFrame({'int':np.arange(5), 'float':np.random.randn(5), 'str':list('abcde')})
df
Out[69]:
      float  int str
0  0.987218    0   a
1  0.336119    1   b
2  1.800194    2   c
3  4.566850    3   d
4 -0.306808    4   e
In [71]:
df.select_dtypes([np.number])
Out[71]:
      float  int
0  0.987218    0
1  0.336119    1
2  1.800194    2
3  4.566850    3
4 -0.306808    4
This accepts any type in the NumPy type hierarchy, so a more specific abstract type such as np.integer or np.floating narrows the selection further.
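As a rough illustration of the type-hierarchy point, the sketch below selects only integer or only floating-point columns. The frame and the names ints_only, floats_only and numeric are made up for this example and are not from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical frame, similar in shape to the one above
df = pd.DataFrame({
    "int": np.arange(5),
    "float": np.linspace(0.0, 1.0, 5),
    "str": list("abcde"),
})

# np.integer matches any integer dtype (int32, int64, uint8, ...)
ints_only = df.select_dtypes(include=[np.integer])

# np.floating matches any floating-point dtype (float32, float64, ...)
floats_only = df.select_dtypes(include=[np.floating])

# np.number is the common parent and matches both
numeric = df.select_dtypes(include=[np.number])
```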
To remove columns that contain any NaNs, you can then call dropna(axis=1) (thanks @Leb).
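Putting the two steps together might look like the sketch below. The column names and the 75% cutoff are assumptions for illustration only. The thresh argument of dropna is a standard pandas parameter that keeps a column only if it has at least that many non-NaN values, which may suit the "mostly NaNs" case better than dropping on any single NaN.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: one complete column, one mostly-NaN column,
# one column with a single gap, and one non-numeric column
df = pd.DataFrame({
    "full": [1.0, 2.0, 3.0, 4.0],
    "sparse": [np.nan, np.nan, np.nan, 1.0],
    "one_gap": [1.0, np.nan, 3.0, 4.0],
    "label": list("abcd"),
})

# Step 1: keep numeric dtypes only (drops "label")
numeric = df.select_dtypes(include=[np.number])

# Step 2a: drop every column that contains at least one NaN
strict = numeric.dropna(axis=1)

# Step 2b: keep columns with at least 75% non-NaN values
# (the 75% cutoff is an arbitrary choice for this example)
min_valid = int(0.75 * len(numeric))
lenient = numeric.dropna(axis=1, thresh=min_valid)
```

Here strict keeps only the complete column, while lenient also keeps the column with a single missing value.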
Upvotes: 3