Pandas - compare 2 columns and choose value based on Priority

Question

Below is my input dataframe

df = pd.DataFrame({'Level_DB': ['Level 1 Experienced' ,'Level 2 Expert', 'Level 1 Experienced', 'Level 2 Expert', 'Level 3 Thought Leader', 'Level 1 Experienced', 'Non-Certified', 'Level 3 Thought Leader', 'Certified', 'Certified', np.nan, 'Level 1 Experienced'], 
                    'Level_Legacy' :[ 'Certified', 'Level 1 Experienced', 'Level 3 Thought Leader', 'Level 3 Thought Leader Recert', 'Level 3 Thought Leader Recert', 'Non-Certified', 'non-certified', 'Level 2 Expert Recert', 'Level 1 Experienced', 'Non-Certified', 'Certified', '']})

And, the target column:'Output' should be produced after comparing both input columns 'Level_DB' and 'Level_Legacy' and choose the highest priority value. The priority List is as below

priority_List = ['Level 3 Thought Leader', 'Level 3 Thought Leader New', 'Level 3 Thought Leader Recert', 'Level 3 Thought Leader Recert Lapsed',
                 'Level 2 Expert', 'Level 2 Expert New', 'Level 2 Expert Recert', 'Level 2 Expert Recert Lapsed',
                 'Level 1 Experienced', 'Level 1 Experienced New', 'Level 1 Experienced Recert', 'Level 1 Experienced Recert Lapsed', 'Certified', 'Non-Certified' , 'non-certified']

Expected final DataFrame with desired 'Output' column as below

Nothing strikes my brain to start with. Please help

jezrael · Accepted Answer

Idea is create ordered categoricals, with reshape by DataFrame.stack, so output is max value per level=0:

from pandas.api.types import CategoricalDtype
cat_type = CategoricalDtype(categories=priority_List[::-1],ordered=True)

#solution if more columns in data
#df['Output'] = df[['Level_DB','Level_Legacy']].stack().astype(cat_type).max(level=0)
df['Output'] = df.stack().astype(cat_type).max(level=0)
print (df)
                  Level_DB                   Level_Legacy  \
0      Level 1 Experienced                      Certified   
1           Level 2 Expert            Level 1 Experienced   
2      Level 1 Experienced         Level 3 Thought Leader   
3           Level 2 Expert  Level 3 Thought Leader Recert   
4   Level 3 Thought Leader  Level 3 Thought Leader Recert   
5      Level 1 Experienced                  Non-Certified   
6            Non-Certified                  non-certified   
7   Level 3 Thought Leader          Level 2 Expert Recert   
8                Certified            Level 1 Experienced   
9                Certified                  Non-Certified   
10                     NaN                      Certified   
11     Level 1 Experienced                                  

                           Output  
0             Level 1 Experienced  
1                  Level 2 Expert  
2          Level 3 Thought Leader  
3   Level 3 Thought Leader Recert  
4          Level 3 Thought Leader  
5             Level 1 Experienced  
6                   Non-Certified  
7          Level 3 Thought Leader  
8             Level 1 Experienced  
9                       Certified  
10                      Certified  
11            Level 1 Experienced

Pandas - compare 2 columns and choose value based on Priority

Answers (2)

Related Questions