Reputation: 1272
I've searched and tried several answers here on SO, but they all return the rows with NaNs. I'd like to return only the columns that contain NaN values. For example, given the following df, how can I select columns 'A' and 'LG'?
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {'H': ['a', 'b', 'c'],
     'A': [np.nan, 'd', 'e'],
     'LG': ['AR1', 'RO1', np.nan],
    })
print(df)
     A  H   LG
0  NaN  a  AR1
1    d  b  RO1
2    e  c  NaN
Upvotes: 3
Views: 7331
Reputation: 862751
I think you first need to replace the string NaN values with np.nan, as in this sample:
df = pd.DataFrame(
    {'H': ['a', 'b', 'c'],
     'A': [np.nan, 'd', 'e'],
     'LG': ['AR1', 'RO1', np.nan],
    })
mask = df.isnull().any()
print(mask)
A      True
H     False
LG     True
dtype: bool
Finally, use boolean indexing on the mask's index to get the column names:
print(mask.index[mask])
Index(['A', 'LG'], dtype='object')
If you need the columns themselves, use loc:
print(df.loc[:, mask])
     A   LG
0  NaN  AR1
1    d  RO1
2    e  NaN
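The first step mentioned above (turning string NaN values into real np.nan) is not shown in the sample, so here is a minimal sketch of it. It assumes the missing values arrived as the literal string 'NaN'; the names raw, cleaned, and nan_columns are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical input where missing values are the literal string 'NaN'
raw = pd.DataFrame(
    {'H': ['a', 'b', 'c'],
     'A': ['NaN', 'd', 'e'],
     'LG': ['AR1', 'RO1', 'NaN'],
    })

# The string 'NaN' is not treated as missing, so no column is flagged yet
flagged_before = bool(raw.isnull().any().any())

# Convert the placeholder strings into real missing values
cleaned = raw.replace('NaN', np.nan)

# Same approach as above: one boolean per column, then pick the True ones
mask = cleaned.isnull().any()
nan_columns = mask.index[mask].tolist()
print(flagged_before)
print(nan_columns)
```

If your data uses a different placeholder (an empty string, 'null', and so on), pass that value to replace instead.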
Upvotes: 4
Reputation: 1444
This returns all columns containing NaN.
df = pd.DataFrame(
    {'H': ['a', 'b', 'c'],
     'A': [np.nan, 'd', 'e'],
     'LG': ['AR1', 'RO1', np.nan],
    })
x = pd.isnull(df).any()
print(df[x.index[x]])
     A   LG
0  NaN  AR1
1    d  RO1
2    e  NaN
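If you need this in more than one place, the same idea can be wrapped in a small helper. This is only a sketch: columns_with_nan is a made-up name, and it uses isna(), which performs the same check as isnull().

```python
import numpy as np
import pandas as pd

def columns_with_nan(frame: pd.DataFrame) -> pd.DataFrame:
    """Return only the columns of frame that contain at least one NaN."""
    # any() reduces each column to a single boolean; loc selects the True ones
    return frame.loc[:, frame.isna().any()]

df = pd.DataFrame(
    {'H': ['a', 'b', 'c'],
     'A': [np.nan, 'd', 'e'],
     'LG': ['AR1', 'RO1', np.nan],
    })

result = columns_with_nan(df)
print(list(result.columns))
```

Selecting with loc and a boolean mask keeps the original row index, so the result lines up with df if you want to join it back later.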
Upvotes: 1