rafat.ch

Reputation: 113

Python Dataframe get the NaN columns for each row

I have a pandas data frame that looks like this:

      a     b      c
0   NaN   2.0  165.0
1   NaN   9.0    NaN
2   NaN   NaN    NaN
3  15.0  15.0    NaN
4   5.0   NaN   11.0

I would like to add a column that summarizes the missing values: for every row, it should list the names of the columns whose value is NaN. Something like this:

      a     b      c    summary
0   NaN   2.0  165.0          a
1   NaN   9.0    NaN      a + c
2   NaN   NaN    NaN  a + b + c
3  15.0  15.0    NaN          c
4   5.0   NaN   11.0          b

Upvotes: 3

Views: 588

Answers (1)

jpp

Reputation: 164613

Here is one way: apply a function to each row that joins the names of the columns whose value in that row is NaN.

import pandas as pd
import numpy as np

df = pd.DataFrame(
    [
        [np.nan, 2, 165], 
        [np.nan, 9, np.nan], 
        [np.nan, np.nan, np.nan],
        [15, 15, np.nan], 
        [5, np.nan, 11]
    ], 
    columns=['a', 'b', 'c']
)

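# For each row, join the names of the columns whose value is missing.
# pd.isna also handles non-numeric values, where np.isnan would raise a TypeError.
df['summary'] = df.apply(lambda row: ' + '.join(i for i in ['a', 'b', 'c'] if pd.isna(row[i])), axis=1)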
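
If you would rather not hard-code the column names, the same idea can be sketched by computing a boolean mask once with `isna()` and building the strings from it. This is only an alternative sketch, not a different result: the `summary` column name is taken from the question, and `cols` and `mask` are names introduced here for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    [
        [np.nan, 2, 165],
        [np.nan, 9, np.nan],
        [np.nan, np.nan, np.nan],
        [15, 15, np.nan],
        [5, np.nan, 11],
    ],
    columns=['a', 'b', 'c'],
)

# Snapshot the column names before the new column is added.
cols = list(df.columns)

# Boolean array: True wherever the original frame holds NaN.
mask = df[cols].isna().to_numpy()

# For each row of the mask, join the names of the columns flagged as missing.
df['summary'] = [
    ' + '.join(name for name, missing in zip(cols, row) if missing)
    for row in mask
]

print(df)
```

A row with no missing values would get an empty string here, since there is nothing to join.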

Upvotes: 3
