create new column based on multiple condition and multiple column pandas

Question

My DataFrame has other columns besides the ones below. I want to create a new column from these 3 columns that gives the final status of each row.

status_1                   status_2      status_3
a_accepted_with_comment    a_revised     c_approved
a_accepted_with_comment    c_rejected    nan
a_rejected                 a_approved    nan
a_rejected                 nan           nan

For each row, take the last of the 3 columns that has a value:

If that value is c_approved, the new column should be approved.

If that value is c_rejected, the new column should be rejected.

If that value is a_approved, the new column should be revised.

If that value is a_rejected, the new column should be rejected.

The final table would look like this:

status_1                   status_2      status_3      final_status
a_accepted_with_comment    a_revised     c_approved    approved
a_accepted_with_comment    c_rejected    nan           rejected
a_rejected                 a_approved    nan           revised
a_rejected                 nan           nan           rejected

How can I create this new column from multiple conditions in Python?

Thanks in advance.

Cameron Riddell · Accepted Answer

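You can use ffill and map. ffill(axis=1) carries the last non-null value in each row across to status_3, and map translates that value into the final status using a dictionary of your criteria and what they result in.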

response_rules = {
    "c_approved": "approved",
    "c_rejected": "rejected",
    "a_approved": "revised",
    "a_rejected": "rejected"
}

# forward-fill across columns so status_3 holds each row's last non-null value, then map it
df["final_status"] = df.ffill(axis=1)["status_3"].map(response_rules)
print(df)
                  status_1    status_2    status_3 final_status
0  a_accepted_with_comment   a_revised  c_approved     approved
1  a_accepted_with_comment  c_rejected         NaN     rejected
2               a_rejected  a_approved         NaN      revised
3               a_rejected         NaN         NaN     rejected
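
Since the question mentions other columns besides the three status columns, it may be safer to forward-fill only the status columns, so that values from unrelated columns cannot be carried into status_3. Below is a minimal, self-contained sketch of that variation. It assumes the missing entries are real NaN values rather than the string "nan"; the `other` column and the names `status_cols` and `last_status` are made up for illustration.

```python
import numpy as np
import pandas as pd

# Sample data from the question; "other" is a hypothetical unrelated column.
df = pd.DataFrame({
    "other": ["w", "x", "y", "z"],
    "status_1": ["a_accepted_with_comment", "a_accepted_with_comment",
                 "a_rejected", "a_rejected"],
    "status_2": ["a_revised", "c_rejected", "a_approved", np.nan],
    "status_3": ["c_approved", np.nan, np.nan, np.nan],
})

response_rules = {
    "c_approved": "approved",
    "c_rejected": "rejected",
    "a_approved": "revised",
    "a_rejected": "rejected",
}

# Forward-fill across the status columns only, then take the right-most one,
# which now holds each row's last non-null status.
status_cols = ["status_1", "status_2", "status_3"]
last_status = df[status_cols].ffill(axis=1).iloc[:, -1]

df["final_status"] = last_status.map(response_rules)
print(df)
```

If the data actually contains the literal string "nan", it would need to be converted to a real missing value first (for example with `replace`) before ffill can skip over it.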

If you have a lot of rules, a more maintainable pattern may be to keep an easily readable and editable dictionary that maps each outcome to its list of criteria, then invert it before calling .map.

response_rules = {
    "approved": ["c_approved"],
    "rejected": ["c_rejected", "a_rejected"],
    "revised": ["a_approved"]
}
# invert dictionary
inverted_rules = {vv: k for k, v in response_rules.items() for vv in v}

# same as before
df["final_status"] = df.ffill(axis=1)["status_3"].map(inverted_rules)

print(df)
                  status_1    status_2    status_3 final_status
0  a_accepted_with_comment   a_revised  c_approved     approved
1  a_accepted_with_comment  c_rejected         NaN     rejected
2               a_rejected  a_approved         NaN      revised
3               a_rejected         NaN         NaN     rejected
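
One thing to keep in mind: .map leaves NaN for any value that is not a key in the dictionary, for example a row whose last status is a_revised, or a row where all three columns are empty. If you would rather have an explicit label, you could chain fillna. A small sketch, where the "unknown" label and the sample `last_status` Series are assumptions for illustration:

```python
import pandas as pd

inverted_rules = {
    "c_approved": "approved",
    "c_rejected": "rejected",
    "a_rejected": "rejected",
    "a_approved": "revised",
}

# Hypothetical "last non-null status" values: one covered by the rules,
# one not covered, and one row with no status at all.
last_status = pd.Series(["c_approved", "a_revised", None])

# Values without a rule come back as NaN from .map; fillna labels them.
final_status = last_status.map(inverted_rules).fillna("unknown")
print(final_status.tolist())
```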



# Just so you can see the inverted dictionary (pprint sorts the keys):
from pprint import pprint
pprint(inverted_rules)
{'a_approved': 'revised',
 'a_rejected': 'rejected',
 'c_approved': 'approved',
 'c_rejected': 'rejected'}
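
A caveat with the one-line inversion: if a criterion is accidentally listed under two outcomes, the dictionary comprehension silently keeps whichever comes last. If that matters for your rules, a stricter inversion could look like the sketch below. `invert_rules` is a hypothetical helper written for this example, not part of pandas.

```python
response_rules = {
    "approved": ["c_approved"],
    "rejected": ["c_rejected", "a_rejected"],
    "revised": ["a_approved"],
}


def invert_rules(rules):
    """Turn {outcome: [criteria]} into {criterion: outcome}.

    Raises ValueError if the same criterion appears under two outcomes.
    """
    inverted = {}
    for outcome, criteria in rules.items():
        for criterion in criteria:
            if criterion in inverted:
                raise ValueError(
                    f"{criterion!r} is listed under both "
                    f"{inverted[criterion]!r} and {outcome!r}"
                )
            inverted[criterion] = outcome
    return inverted


inverted_rules = invert_rules(response_rules)
print(inverted_rules)
```

The result can be passed to .map exactly like the comprehension-based version.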
