yangyang
yangyang

Reputation: 531

create new column from a list of columns and get only the last columns condition in pandas

I have this table where each row has different status that put in multiple columns. These 5 status is the combination of other columns, so that is why it has possibility to be nan_nan.

no   status1                   status2              status3    status4     status5  
1  ifa_accepted_w/comment  ifc_approved             nan_nan     nan_nan     nan_nan
2  ifa_accepted_w/comment  ifc_accepted_w/comment   nan_nan     nan_nan     nan_nan
3  ifa_rejected            ifa_approved             nan_nan     nan_nan     nan_nan
4  ifa_rejected            ifa_approved             nan_nan     nan_nan     nan_nan
5  ifc_approved              nan_nan                nan_nan     nan_nan     nan_nan
6  ifa_accepted_w/comment    ifc_nan                nan_nan     nan_nan     nan_nan

I need to create new column where shows the last status from the 5 status column of each row. I have tried below script which first I define all the rules I have, then I create a new table by mapping from the rules.

response_rules = {
    "approved": ["ifc_approved"],
    "sub": ["ifc_accepted_w/comment", "ifc_rejected", "ifa_accepted_w/comment"],
    "tp": ["ifc_nan", "ifa_approved"]
}

inverted_rules = {vv: k for k, v in response_rules.items() for vv in v}

df["status"] = df.ffill(axis=1)["status5].map(inverted_rules)

But the scripts did not work. It gives me NaN value in the new column instead of the string from my response_rules.

The expected result is:

no   status1                   status2              status3    status4     status5     status
1  ifa_accepted_w/comment  ifc_approved             nan_nan     nan_nan     nan_nan     approved
2  ifa_accepted_w/comment  ifc_accepted_w/comment   nan_nan     nan_nan     nan_nan     sub
3  ifa_rejected            ifa_approved             nan_nan     nan_nan     nan_nan     tp
4  ifa_rejected            ifa_approved             nan_nan     nan_nan     nan_nan     tp
5  ifc_approved              nan_nan                nan_nan     nan_nan     nan_nan     approved
6  ifa_accepted_w/comment    ifc_nan                nan_nan     nan_nan     nan_nan     tp

what is wrong with the script I have? is there any other way where I can get my desired table?

Thankyou!

Upvotes: 0

Views: 32

Answers (1)

jezrael
jezrael

Reputation: 862901

I think you need replace nan_nan to missing values:

df["status"] = df.replace('nan_nan', np.nan).ffill(axis=1)["status5"].map(inverted_rules)
#for select ast column is possible use also iloc
#df["status"] = df.replace('nan_nan', np.nan).ffill(axis=1).iloc[:, -1].map(inverted_rules)
print (df)
   no                 status1                 status2  status3  status4  \
0   1  ifa_accepted_w/comment            ifc_approved  nan_nan  nan_nan   
1   2  ifa_accepted_w/comment  ifc_accepted_w/comment  nan_nan  nan_nan   
2   3            ifa_rejected            ifa_approved  nan_nan  nan_nan   
3   4            ifa_rejected            ifa_approved  nan_nan  nan_nan   
4   5            ifc_approved                 nan_nan  nan_nan  nan_nan   
5   6  ifa_accepted_w/comment                 ifc_nan  nan_nan  nan_nan   

   status5    status  
0  nan_nan  approved  
1  nan_nan       sub  
2  nan_nan        tp  
3  nan_nan        tp  
4  nan_nan  approved  
5  nan_nan        tp  

Upvotes: 1

Related Questions