pelopid

Reputation: 39

python isnull().sum() handle headers

I have a dataset and I want to count the missing values in each column. If a column has missing values, I want to print its header name. I use the following code to find the missing values per column:

isnull().sum()

If I print the result, everything is OK. But if I try to store the result in a variable and then work with the headers, I can't!

newList = pd.isnull(myData).sum()
print(newList)

In this case the output is:

Name             5
Surname          0
Age              3

I want to print only Surname, but I can't find how to get it into a variable.

newList = pd.isnull(myData).sum()
print(newList[0])

This prints 5 (the number of missing values for column 'Name').

Upvotes: 0

Views: 1718

Answers (1)

jezrael

Reputation: 863301

Use boolean indexing with Series:

import numpy as np
import pandas as pd

df = pd.DataFrame({'A':list('abcdef'),
                   'B':[4,5,4,5,5,4],
                   'C':[np.nan,8,9,4,2,3],
                   'D':[1,3,5,np.nan,1,0],
                   'E':[5,3,6,9,2,4],
                   'F':list('aaabbb')})

print(df)
   A  B    C    D  E  F
0  a  4  NaN  1.0  5  a
1  b  5  8.0  3.0  3  a
2  c  4  9.0  5.0  6  a
3  d  5  4.0  NaN  9  b
4  e  5  2.0  1.0  2  b
5  f  4  3.0  0.0  4  b

newList = df.isnull().sum()
print(newList)
A    0
B    0
C    1
D    1
E    0
F    0
dtype: int64

# names of columns that contain at least one NaN
print(newList.index[newList != 0].tolist())
['C', 'D']

# names of columns that contain no NaNs
print(newList.index[newList == 0].tolist())
['A', 'B', 'E', 'F']
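
Applied to data shaped like yours, the same idea might look like the sketch below. The contents of myData are made up here (only the column names and the 5 / 0 / 3 missing counts come from the question), and counts, cols_with_nan and cols_without_nan are just illustrative variable names.

```python
import numpy as np
import pandas as pd

# Hypothetical data: values are invented, chosen only so that the
# missing counts match the question (Name 5, Surname 0, Age 3).
myData = pd.DataFrame({
    'Name': [np.nan, np.nan, np.nan, np.nan, np.nan, 'Ann'],
    'Surname': ['A', 'B', 'C', 'D', 'E', 'F'],
    'Age': [np.nan, np.nan, np.nan, 30, 41, 25],
})

# Series of missing-value counts, indexed by column name
counts = myData.isnull().sum()

# Boolean indexing on the index gives the header names as plain lists
cols_with_nan = counts.index[counts != 0].tolist()
cols_without_nan = counts.index[counts == 0].tolist()

print(cols_with_nan)      # ['Name', 'Age']
print(cols_without_nan)   # ['Surname']

# A single count can be looked up by label instead of by position
print(counts['Name'])     # 5
```

The result of isnull().sum() is a Series rather than a list, so the headers live in its index. That is why filtering counts.index returns the names, while indexing the Series itself returns the counts.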

Upvotes: 2
