Reputation: 941
My goal is to add a column that lists, for each row, the name(s) of the columns whose cells are empty.
Name  x  y  z
abc   1     3
         0
ijk   m
lmn   1  2  3
The new column would be:
Name  x  y  z  Empty
abc   1     3  y
         0     x,Name,z
ijk   m        y,z
lmn   1  a  c
I've tried pd.isnull(df).any(1).nonzero(), but this only shows which rows contain an empty cell, not which columns are empty.
Many thanks! :)
Upvotes: 2
Views: 156
Reputation: 323306
df['Missing']=df.where((df=='')).stack().reset_index().groupby('level_0')['level_1'].apply(','.join)
df
Out[222]:
   Name  x  y  z   Missing
0   abc  1     3         y
1            0     Name,x,z
2   ijk  m             y,z
3   lmn  1  2  3       NaN
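In case the one-liner is hard to follow, here is a rough step-by-step sketch of the same chain. The sample frame is my own reconstruction of the question's data, assuming the empty cells are empty strings. The explicit dropna() call is an addition on my part so the result does not depend on whether stack() drops NaN by default in your pandas version.

```python
import pandas as pd

# Assumed sample data: empty cells are represented as empty strings.
df = pd.DataFrame(
    {
        "Name": ["abc", "", "ijk", "lmn"],
        "x": ["1", "", "m", "1"],
        "y": ["", "0", "", "2"],
        "z": ["3", "", "", "3"],
    }
)

# Step 1: keep only the empty-string cells; everything else becomes NaN.
kept = df.where(df == "")

# Step 2: reshape to one (row, column) pair per cell and discard the NaN cells,
# so only the empty cells remain. reset_index() names the unnamed index levels
# level_0 (row label) and level_1 (column name).
long = kept.stack().dropna().reset_index()

# Step 3: for each row label, join the names of its empty columns.
missing = long.groupby("level_0")["level_1"].agg(",".join)

# Step 4: assignment aligns on the index, so rows without empty cells get NaN.
df["Missing"] = missing
print(df)
```

If the empty cells in your data are NaN rather than empty strings, the mask in step 1 would need to be df.isna() instead, and the rest would have to be adapted accordingly.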
Upvotes: 1
Reputation: 1873
Well... this isn't a vectorized solution, but it does the job. Maybe someone else will come along with a better way. You'll have to fix the get_null_col_name function to handle 0 and NaN as well: as written, the test not row[col] counts 0 as empty (0 is falsy) and misses NaN (NaN is truthy). But this should give you an idea.
>>> df
   Name     x     y     z
0   abc     1  None     3
1  None  None     0  None
2   ijk     m  None  None
3   lmn     1     a     c
>>> def get_null_col_name(row):
... return ','.join([col for col in row.index if not row[col]])
...
>>> df['Empty'] = df.apply(get_null_col_name, axis=1)
>>> df
   Name     x     y     z       Empty
0   abc     1  None     3           y
1  None  None     0  None  Name,x,y,z
2   ijk     m  None  None         y,z
3   lmn     1     a     c
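For what it's worth, here is one possible way to make that fix, as a sketch only. It assumes that "empty" means None, NaN, or an empty string, and that 0 is a real value. The sample frame is made up to include a NaN, and the function body is my own variation rather than anything from the original code.

```python
import numpy as np
import pandas as pd


def get_null_col_name(row):
    """Return the comma-joined names of the columns that are None, NaN, or ''."""
    # isna() catches None and NaN; eq("") catches empty strings.
    # A literal 0 matches neither test, so it is kept as a real value.
    empty = row.isna() | row.eq("")
    return ",".join(row.index[empty.to_numpy()])


# Assumed sample data, mixing None, NaN, and a literal 0.
df = pd.DataFrame(
    {
        "Name": ["abc", None, "ijk", "lmn"],
        "x": ["1", None, "m", "1"],
        "y": [None, 0, np.nan, "a"],
        "z": ["3", None, None, "c"],
    }
)

df["Empty"] = df.apply(get_null_col_name, axis=1)
print(df)
```

With this version, row 1 should report Name,x,z rather than Name,x,y,z, because the 0 in y is no longer treated as empty.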
Upvotes: 1