Reputation: 75
I have a starting dateframe like the one presented below.
df_Start = pd.DataFrame({'P':['IG','HY','IG',np.nan,np.nan], 'M':['HY','HY','IG', np.nan,np.nan], 'F' ['HY',np.nan,'HY', np.nan,np.nan],'D':['IG','IG','IG', 'HY',np.nan]})`
I would like to create a new column titled 'Result' based off the row values in columns P, M, F, D. The conditions I want the 'Result' column to follow are 1) If any of the rows in columns P,M,F,D contain 'IG' then the Result column will populate with 'IG' 2) If any of the rows in columns P,M,F,D contain 'HY' and no 'IG' then the 'Result' column will populate with 'HY' 3) If a row in columns P,M,F,D do not contain 'HY' or 'IG' and are just 'NaN' then the Result column will populate with 'None'
If these conditions are applied correctly, the final df would result would like the below df.
df_end = pd.DataFrame({'Result':['IG','IG','IG', 'HY',None],'P':['IG','HY','IG',np.nan,np.nan], 'M':['HY','HY','IG', np.nan,np.nan], 'F':['HY',np.nan,'HY', np.nan,np.nan],'D':['IG','IG','IG', 'HY',np.nan]})
Upvotes: 2
Views: 66
Reputation: 18377
Maybe this works for you?
import pandas as pd
import numpy as np
df_start =pd.DataFrame({'P':['IG','HY','IG',np.nan,np.nan], 'M':['HY','HY','IG', np.nan,np.nan], 'F':['HY',np.nan,'HY', np.nan,np.nan],'D':['IG','IG','IG', 'HY',np.nan]})
print(df_start)
df_start['aux'] = df_start.stack().groupby(level=0).apply(list)
df_start['result'] = np.where(df_start['aux'].astype(str).str.contains('IG'),'IG',np.where(df_start['aux'].astype(str).str.contains('HY'),'HY',np.nan))
print(df_start)
Output:
P M F D aux result
0 IG HY HY IG [IG, HY, HY, IG] IG
1 HY HY NaN IG [HY, HY, IG] IG
2 IG IG HY IG [IG, IG, HY, IG] IG
3 NaN NaN NaN HY [HY] HY
4 NaN NaN NaN NaN NaN nan
Upvotes: 3