How to create a new column in a dataframe based off values of other columns?

Question

I have a starting dateframe like the one presented below.

df_Start = pd.DataFrame({'P':['IG','HY','IG',np.nan,np.nan], 'M':['HY','HY','IG', np.nan,np.nan], 'F' ['HY',np.nan,'HY', np.nan,np.nan],'D':['IG','IG','IG', 'HY',np.nan]})`

I would like to create a new column titled 'Result' based off the row values in columns P, M, F, D. The conditions I want the 'Result' column to follow are 1) If any of the rows in columns P,M,F,D contain 'IG' then the Result column will populate with 'IG' 2) If any of the rows in columns P,M,F,D contain 'HY' and no 'IG' then the 'Result' column will populate with 'HY' 3) If a row in columns P,M,F,D do not contain 'HY' or 'IG' and are just 'NaN' then the Result column will populate with 'None'

If these conditions are applied correctly, the final df would result would like the below df.

df_end = pd.DataFrame({'Result':['IG','IG','IG', 'HY',None],'P':['IG','HY','IG',np.nan,np.nan], 'M':['HY','HY','IG', np.nan,np.nan], 'F':['HY',np.nan,'HY', np.nan,np.nan],'D':['IG','IG','IG', 'HY',np.nan]})

Celius Stingher · Accepted Answer

Maybe this works for you?

import pandas as pd
import numpy as np 
df_start =pd.DataFrame({'P':['IG','HY','IG',np.nan,np.nan], 'M':['HY','HY','IG', np.nan,np.nan], 'F':['HY',np.nan,'HY', np.nan,np.nan],'D':['IG','IG','IG', 'HY',np.nan]})
print(df_start)
df_start['aux'] = df_start.stack().groupby(level=0).apply(list)
df_start['result'] = np.where(df_start['aux'].astype(str).str.contains('IG'),'IG',np.where(df_start['aux'].astype(str).str.contains('HY'),'HY',np.nan))
print(df_start)

Output:

     P    M    F    D               aux result
0   IG   HY   HY   IG  [IG, HY, HY, IG]     IG
1   HY   HY  NaN   IG      [HY, HY, IG]     IG
2   IG   IG   HY   IG  [IG, IG, HY, IG]     IG
3  NaN  NaN  NaN   HY              [HY]     HY
4  NaN  NaN  NaN  NaN               NaN    nan

How to create a new column in a dataframe based off values of other columns?

Answers (1)

Related Questions