TylerNG

Reputation: 941

Pandas output null column

My goal is to add a column that displays the field name(s) of the empty cells in each row.

Name x  y z
abc  1    3
        0  
ijk  m    
lmn  1  2 3 

The new column would be:

Name x  y z   Empty
abc  1    3     y
        0      x,Name,z
ijk  m         y,z
lmn  1  a c   

I've tried pd.isnull(df).any(1).nonzero(), but this only shows the rows that contain an empty cell, not which columns are empty.

Many thanks! :)

Upvotes: 2

Views: 156

Answers (2)

BENY

Reputation: 323306

# Assumes empty cells hold empty strings (''), not NaN
df['Missing'] = df.where(df == '').stack().reset_index().groupby('level_0')['level_1'].apply(','.join)
df
Out[222]: 
  Name  x  y  z   Missing
0  abc  1     3         y
1          0     Name,x,z
2  ijk  m             y,z
3  lmn  1  2  3       NaN
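
The one-liner above matches only empty strings. If the frame may also contain NaN or None, a boolean mask can cover both cases. This is a minimal sketch, not a drop-in replacement: the sample data is reconstructed from the question, and the variable name mask is just an illustration.

```python
import pandas as pd

# Sample frame reconstructed from the question; empty cells are '' here.
df = pd.DataFrame({
    "Name": ["abc", "", "ijk", "lmn"],
    "x": ["1", "", "m", "1"],
    "y": ["", "0", "", "2"],
    "z": ["3", "", "", "3"],
})

# True where a cell is NaN/None or an empty string.
mask = df.isna() | df.eq("")

# For each row, join the names of the columns flagged in the mask.
df["Missing"] = mask.apply(
    lambda row: ",".join(col for col, is_empty in row.items() if is_empty),
    axis=1,
)
print(df)
```

Rows with nothing missing get an empty string here rather than NaN, which may be easier to display or export.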

Upvotes: 1

Orenshi

Reputation: 1873

Well... this isn't a vectorized solution, but it does the job. Maybe someone else will come along with a better way. Note that the check "not row[col]" treats 0 as empty and does not treat NaN as empty, so you'll have to adjust the get_null_col_name function if you need to handle 0 and NaN differently. But this should give you an idea.

>>> df
   Name     x     y     z
0   abc     1  None     3
1  None  None     0  None
2   ijk     m  None  None
3   lmn     1     a     c
>>> def get_null_col_name(row):
...     return ','.join([col for col in row.index if not row[col]])
...
>>> df['Empty'] = df.apply(get_null_col_name, axis=1)
>>> df
   Name     x     y     z       Empty
0   abc     1  None     3           y
1  None  None     0  None  Name,x,y,z
2   ijk     m  None  None         y,z
3   lmn     1     a     c
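
Following up on the note about 0 and NaN: one possible adjustment is to test with pd.isna instead of truthiness, so that 0 counts as a real value while None and NaN count as empty. This is only a sketch. The sample data is made up to mirror the frame above (with one NaN added), and treating '' as empty is an assumption.

```python
import numpy as np
import pandas as pd

# Illustrative data mirroring the frame above, with one NaN added in column y.
df = pd.DataFrame({
    "Name": ["abc", None, "ijk", "lmn"],
    "x": [1, None, "m", 1],
    "y": [None, 0, np.nan, "a"],
    "z": [3, None, None, "c"],
})

def get_null_col_name(row):
    # pd.isna is True for None and NaN; 0 is kept as a real value.
    # Empty strings are also treated as empty (an assumption).
    return ",".join(
        col for col in row.index if pd.isna(row[col]) or row[col] == ""
    )

df["Empty"] = df.apply(get_null_col_name, axis=1)
print(df)
```

With this version, row 1 no longer lists y, because its 0 is treated as data.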

Upvotes: 1
