onetap

Reputation: 505

Remove zero-count fields from df.isnull().sum()

I'm using df.isnull().sum() to get a count of the NaN values in each column of a pandas DataFrame.

Is there a way to show only the counts that aren't zero (i.e. if a column has 0 NaNs, leave it out of the result)?

This is the current result; I would like to remove the entries with a count of 0:

Job ID                              0
Agency                              0
Posting Type                        0
# Of Positions                      0
Business Title                      0
Civil Service Title                 0
Title Code No                       0
Level                               0
Job Category                        2
Full-Time/Part-Time indicator     261
Salary Range From                   0
Salary Range To                     0
Salary Frequency                    0
Work Location                       0
Division/Work Unit                  0
Job Description                     0
Minimum Qual Requirements          14
Preferred Skills                  377
Additional Information           1177
To Apply                            1
Hours/Shift                      2123
Work Location 1                  1719
Recruitment Contact              3116
Residency Requirement               0
Posting Date                        0
Post Until                       2214
Posting Updated                     0
Process Date                        0
qualifications                   3092
requir                             14
requir1                            14

Upvotes: 4

Views: 4541

Answers (2)

Cleb

Reputation: 26027

Probably not the most efficient solution, but it also works:

import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [1, 2, np.nan], 'b': [3, 4, 5]})

df.isnull().sum().to_frame(name='counts').query('counts > 0')

yields

   counts
a       1
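
If you would rather keep the result as a Series and skip the intermediate frame, passing a callable to .loc should do the same filtering in one chain. This is only a minimal sketch on the same toy df; the name null_counts is just illustrative:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [1, 2, np.nan], 'b': [3, 4, 5]})

# Count NaNs per column, then keep only the columns whose count is positive.
# The lambda receives the Series produced by sum().
null_counts = df.isnull().sum().loc[lambda s: s > 0]
print(null_counts)
```

Here only column a survives the filter, since b contains no NaNs.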

Upvotes: 5

brentertainer

Reputation: 2198

You can store the Series you have shown as nullseries, then filter that. For example,

nullseries = df.isnull().sum()
print(nullseries[nullseries > 0])

If you need to drop the zero-count entries from the Series itself, reassign the filtered result:

nullseries = nullseries[nullseries > 0]

Here is a short working example:

In [46]: df = pd.DataFrame([[np.nan, 1], [2, 3], [np.nan, 4]], columns=['x', 'y'])

In [47]: nullseries = df.isnull().sum()

In [48]: nullseries[nullseries > 0]
Out[48]:
x    2
dtype: int64
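
If you do this often, the filter can be wrapped in a small function. The sketch below is one possible way to do it: nonzero_null_counts is a hypothetical helper name, the sorting step is an optional extra that was not asked for, and the data is made up (only the column names are borrowed from the question).

```python
import numpy as np
import pandas as pd


def nonzero_null_counts(frame):
    """Hypothetical helper: per-column NaN counts with zero counts dropped."""
    counts = frame.isnull().sum()
    # The boolean mask keeps only columns with at least one missing value;
    # sorting puts the columns with the most missing values first.
    return counts[counts > 0].sort_values(ascending=False)


# Made-up data for illustration only.
df = pd.DataFrame({
    'Job ID': [1, 2, 3],
    'Job Category': ['a', None, 'c'],
    'To Apply': [np.nan, np.nan, 'x'],
})

result = nonzero_null_counts(df)
print(result)
```

With this data, 'Job ID' is dropped because it has no missing values, and 'To Apply' is listed before 'Job Category'.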

Upvotes: 12
