Reputation: 1159
I'm processing a dataset with around 2000 columns and I noticed many of them are empty. I want to know specifically how many of them are empty and how many are not. I use the following code:
df.isnull().sum()
This gives me the number of missing values in each column. However, since I'm looking at about 2000 columns with 7593 rows, the output in IPython looks like the following:
FEMALE_DEATH_YR3_RT 7593
FEMALE_COMP_ORIG_YR3_RT 7593
PELL_COMP_4YR_TRANS_YR3_RT 7593
PELL_COMP_2YR_TRANS_YR3_RT 7593
...
FIRSTGEN_YR4_N 7593
NOT1STGEN_YR4_N 7593
It doesn't show all the columns because there are too many of them, which makes it very difficult to tell how many columns are completely empty and how many are not. Is there a way to identify the non-empty columns quickly? Thanks!
Upvotes: 3
Views: 6017
Reputation: 5668
To find the number of empty (all-NaN) columns, subtract the number of columns that survive dropna from the total:
len(df.columns) - len(df.dropna(axis=1,how='all').columns)
3
df
  Country   movie name  rating  year  Something
0     thg    John    3     NaN   NaN        NaN
1     thg     Jan    4     NaN   NaN        NaN
2     mol  Graham  lob     NaN   NaN        NaN
df = df.dropna(axis=1, how='all')
  Country   movie name
0     thg    John    3
1     thg     Jan    4
2     mol  Graham  lob
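If you want both counts and the column names in one go, something along these lines should work. This is only a sketch: the toy frame is rebuilt here to mirror the example above, and the variable names (all_empty, n_empty, n_non_empty, non_empty_cols) are made up for illustration.

```python
import numpy as np
import pandas as pd

# Toy frame mirroring the example above: three populated columns
# and three columns that contain only NaN.
df = pd.DataFrame({
    "Country": ["thg", "thg", "mol"],
    "movie": ["John", "Jan", "Graham"],
    "name": [3, 4, "lob"],
    "rating": [np.nan] * 3,
    "year": [np.nan] * 3,
    "Something": [np.nan] * 3,
})

# Boolean Series indexed by column name: True where every value is missing.
all_empty = df.isnull().all()

# Summing booleans counts the True entries.
n_empty = int(all_empty.sum())
n_non_empty = int((~all_empty).sum())

# Names of the columns that hold at least one non-missing value.
non_empty_cols = df.loc[:, ~all_empty].columns.tolist()

print(n_empty, n_non_empty)
print(non_empty_cols)
```

Because the result is two numbers and a plain list rather than a 2000-row Series, the truncated IPython display is no longer a problem.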
Upvotes: 4