Reputation: 1225
I have a df that looks like this:
import pandas as pd

df = pd.DataFrame.from_dict({'master_feature': ['ab', float('NaN'), float('NaN')],
                             'feature': [float('NaN'), float('NaN'), 'pq'],
                             'epic': [float('NaN'), 'fg', float('NaN')]})
I want to create a new column named promoted from the columns master_feature, epic, and feature. The value of promoted will be:
- 'master feature' if the adjacent master_feature column value is not null,
- 'feature' if the adjacent feature column value is not null,
- and likewise for epic.
Something like:
df.promoted = 'master feature' if not pd.isnull(df.master_feature)
elseif 'feature' if not pd.isnull(df.feature)
elseif 'epic' pd.isnull(df.epic)
else 'Na'
How can I achieve this using df.apply? Is it much more efficient if I use np.select?
Upvotes: 0
Views: 324
Reputation: 31206
It can be done with a combination of apply() and NumPy's argmin():
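import numpy as np
import pandas as pd
df = pd.DataFrame.from_dict({'master_feature': ['ab', float('NaN'), float('NaN')],
                             'feature': [float('NaN'), float('NaN'), 'pq'],
                             'epic': [float('NaN'), 'fg', float('NaN')]})
# argmin over the isna() mask is the position of the first non-null column in each row;
# .iloc returns that column's value (r.index[...] would give the column name instead).
df.assign(promoted=lambda dfa: dfa.apply(lambda r: r.iloc[np.argmin(r.isna().to_numpy())], axis=1))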
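The question's pseudocode asks for a label rather than the cell value, so here is a minimal sketch of the same apply() idea that returns labels. The helper name first_non_null_label and the LABELS mapping are made up for illustration and are not part of the answer above; the 'Na' fallback is taken from the question's else branch.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'master_feature': ['ab', np.nan, np.nan],
    'feature': [np.nan, np.nan, 'pq'],
    'epic': [np.nan, 'fg', np.nan],
})

# Hypothetical mapping: column name -> label, listed in priority order.
LABELS = {
    'master_feature': 'master feature',
    'feature': 'feature',
    'epic': 'epic',
}


def first_non_null_label(row):
    """Return the label of the first non-null column, or 'Na' if all are null."""
    for col, label in LABELS.items():
        if pd.notna(row[col]):
            return label
    return 'Na'


# axis=1 passes each row to the helper as a Series.
promoted_apply = df.apply(first_non_null_label, axis=1)
result_apply = df.assign(promoted=promoted_apply)
```

Because the helper runs once per row in Python, this is easy to read but is not expected to scale as well as a vectorized approach.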
Upvotes: 1
Reputation: 14113
np.select is the way to go. Try the code below; I think I got the logic right based on your question. Note that there is a discrepancy in your logic: "feature if adjacent feature column value is not null ,and likewise for epic" is not the same as "elseif 'epic' pd.isnull(df.epic)", so I went with: if df['epic'] is not null, then 'epic'.
Let me know if that is correct.
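cond = [~df['master_feature'].isna(),  # if master_feature is not null then 'master feature'
        ~df['feature'].isna(),         # if feature is not null then 'feature'
        ~df['epic'].isna()]            # if epic is not null then 'epic'
choice = ['master feature',
          'feature',
          'epic']
# Default is the string 'Na' (the question's else branch). A float default such as np.nan
# shares no dtype with the string choices: older NumPy stores it as the string 'nan',
# newer releases raise a TypeError.
df['promoted'] = np.select(cond, choice, 'Na')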
  master_feature feature epic        promoted
0             ab     NaN  NaN  master feature
1            NaN     NaN   fg            epic
2            NaN      pq  NaN         feature
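On the efficiency part of the question, a rough way to compare the two approaches is to time them on a larger frame. This is only a sketch: the function names with_select and with_apply, the repeat count of 1000, and the 'Na' default are assumptions for illustration, and the timings will vary by machine and library version.

```python
import timeit

import numpy as np
import pandas as pd

base = pd.DataFrame({
    'master_feature': ['ab', np.nan, np.nan],
    'feature': [np.nan, np.nan, 'pq'],
    'epic': [np.nan, 'fg', np.nan],
})
# Repeat the three sample rows to get a larger frame (1000 is arbitrary).
big = pd.concat([base] * 1000, ignore_index=True)


def with_select(frame):
    """Vectorized version: the first condition that is True wins."""
    cond = [frame['master_feature'].notna(),
            frame['feature'].notna(),
            frame['epic'].notna()]
    choice = ['master feature', 'feature', 'epic']
    return np.select(cond, choice, default='Na')


def with_apply(frame):
    """Row-wise version: a Python function is called once per row."""
    def label(row):
        if pd.notna(row['master_feature']):
            return 'master feature'
        if pd.notna(row['feature']):
            return 'feature'
        if pd.notna(row['epic']):
            return 'epic'
        return 'Na'
    return frame.apply(label, axis=1).to_numpy()


t_select = timeit.timeit(lambda: with_select(big), number=3)
t_apply = timeit.timeit(lambda: with_apply(big), number=3)
print(f"np.select: {t_select:.4f}s  apply: {t_apply:.4f}s")
```

Both functions should produce the same labels; since apply with axis=1 loops over rows in Python, np.select is generally expected to come out ahead as the frame grows.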
Upvotes: 1