kiyas
kiyas

Reputation: 155

Conditional selection of values from other column

Continuation to my previous question

Some of the rows in my dataset are messy

Serial     Val1      Val2      Val3      
1          21.10     NaN       13.51     
1          43.06     NaN       20.51     
1          32.12     NaN       NaN       
2          NaN       11.20     NaN       
2          NaN       NaN       NaN       
3          45.10     NaN       NaN       
3          14.16     NaN       NaN      
4          NaN       34.90     NaN       
4          NaN       12.12     11.10     
4          NaN       18.09     NaN       

These are grouped based on their unique Serial. For example, Serial 1 has values for Val1 and Val3 but I would still prefer to choose values from Val1 for the ['All'] column. In choosing which Val columns to select for ['All'], Val1 if available then Val2 if Val1 is not available... (Val1>Val2>Val3)

Serial     Val1      Val2      Val3      All       Source
1          21.10     NaN       13.51     21.10     Val1
1          43.06     NaN       20.51     43.06
1          32.12     NaN       NaN       32.12
2          NaN       11.20     NaN       11.20     Val2
2          NaN       NaN       NaN       NaN  
3          45.10     NaN       NaN       45.10     Val1
3          14.16     NaN       NaN       14.16
4          NaN       34.90     NaN       34.90     Val2
4          NaN       12.12     11.10     12.12    
4          NaN       18.09     NaN       18.09     

Thank you

Upvotes: 0

Views: 66

Answers (1)

jezrael
jezrael

Reputation: 863731

You can first backfilling missing values and seelct first val by positions with DataFrame.iloc and for second use same solution like before:

df1 = df[['Val1','Val2','Val3']]

mask = df1.isna().all(axis=1)
mask1 = df['Serial'].duplicated()

df = (df.assign(All = df1.bfill(axis=1).iloc[:, 0],
                Source = df1.notna().idxmax(axis=1).mask(mask1 | mask)))
print (df)
   Serial   Val1   Val2   Val3    All Source
0       1  21.10    NaN  13.51  21.10   Val1
1       1  43.06    NaN  20.51  43.06    NaN
2       1  32.12    NaN    NaN  32.12    NaN
3       2    NaN  11.20    NaN  11.20   Val2
4       2    NaN    NaN    NaN    NaN    NaN
5       3  45.10    NaN    NaN  45.10   Val1
6       3  14.16    NaN    NaN  14.16    NaN
7       4    NaN  34.90    NaN  34.90   Val2
8       4    NaN  12.12  11.10  12.12    NaN
9       4    NaN  18.09    NaN  18.09    NaN

Upvotes: 2

Related Questions