Lisa

Reputation: 337

Create an indicator for missing values in a data frame in Python

I would like to create an indicator column in my data frame which shows me if values are missing in other columns. For example:

| var_1 | var_2 | indicator |
|-------|-------|-----------|
|   3   |   2   |     1     |
|  NaN  |   4   |     2     |
|   1   |  NaN  |     3     |

As you can see, the new column "indicator" should be 1 if no value is missing in var_1 and var_2, 2 if only var_1 is missing, and 3 if only var_2 is missing. A piece of code would be very helpful. Thank you!

Upvotes: 4

Views: 1590

Answers (1)

anky

Reputation: 75120

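Use np.select(), which is vectorized and therefore fast. It checks the conditions in order and falls back to the default (here 1) when none of them is true.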

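import numpy as np
import pandas as pd
df = pd.DataFrame({'var_1': [3, np.nan, 1], 'var_2': [2, 4, np.nan]})
# 2 if var_1 is missing, 3 if var_2 is missing, otherwise 1
df['indicator'] = np.select([df.var_1.isnull(), df.var_2.isnull()], [2, 3], default=1)
print(df)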

   var_1  var_2  indicator
0    3.0    2.0          1
1    NaN    4.0          2
2    1.0    NaN          3
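
The question does not say what should happen when both columns are missing in the same row. Because np.select() takes the first condition that matches, the one-liner above would give such a row a 2. If that case needs its own code, something like the following might work. This is only a sketch: the value 4 and the extra fourth row are assumptions made for illustration and are not part of the original question.

```python
import numpy as np
import pandas as pd

# Example frame from the question, plus one extra (hypothetical) row
# where both values are missing
df = pd.DataFrame({
    "var_1": [3, np.nan, 1, np.nan],
    "var_2": [2, 4, np.nan, np.nan],
})

m1 = df["var_1"].isna()  # True where var_1 is missing
m2 = df["var_2"].isna()  # True where var_2 is missing

# np.select uses the first condition that is True, so the "both missing"
# test has to come before the single-column tests.
# The code 4 is an assumed choice; the question does not define it.
df["indicator"] = np.select([m1 & m2, m1, m2], [4, 2, 3], default=1)
print(df)
```

With this ordering the four rows get the indicators 1, 2, 3 and 4. Putting m1 & m2 last would not work, since the row would already have matched m1.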

Upvotes: 4
