Reputation: 337
I would like to create an indicator column in my data frame which shows me if values are missing in other columns. For example:
| var_1 | var_2 | indicator |
|-------|-------|-----------|
| 3     | 2     | 1         |
| NaN   | 4     | 2         |
| 1     | NaN   | 3         |
As you can see, the new column "indicator" should be 1 if neither var_1 nor var_2 is missing, 2 if only var_1 is missing, and 3 if only var_2 is missing. A short piece of code would be very helpful. Thank you!
Upvotes: 4
Views: 1590
Reputation: 75120
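Use np.select(), which is also fast. The conditions are checked in order, and rows matching none of them get the default value (the third argument, 1 here).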
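import numpy as np
import pandas as pd
# Sample frame matching the question
df = pd.DataFrame({'var_1': [3, np.nan, 1], 'var_2': [2, 4, np.nan]})
# 2 if var_1 is missing, 3 if var_2 is missing, otherwise 1
df['indicator'] = np.select([df.var_1.isnull(), df.var_2.isnull()], [2, 3], default=1)
print(df)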
   var_1  var_2  indicator
0    3.0    2.0          1
1    NaN    4.0          2
2    1.0    NaN          3
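The question does not say what should happen when both columns are missing. With the np.select() call above, such a row would get 2, because the var_1 condition is checked first. If that case needs its own code, one possible sketch is to put the most specific condition first. The value 4 and the extra sample row are assumptions made for illustration, not part of the question.

```python
import numpy as np
import pandas as pd

# Sample data from the question plus one hypothetical row with both values missing
df = pd.DataFrame({"var_1": [3, np.nan, 1, np.nan],
                   "var_2": [2, 4, np.nan, np.nan]})

m1 = df["var_1"].isna()
m2 = df["var_2"].isna()

# np.select uses the first condition that is True for each row,
# so the "both missing" test has to come before the single-column tests.
# 4 is an assumed code for "both missing".
df["indicator"] = np.select([m1 & m2, m1, m2], [4, 2, 3], default=1)
print(df)
```

Listing the combined condition first keeps the single-column conditions simple, since they never need to exclude the both-missing rows explicitly.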
Upvotes: 4