Reputation: 67
I have a data frame:
id|concept |description
12| |rewards member
12|tier one |
12|not avail |rewards member
GOAL: Create a new column final_desc
with the content in either the concept
or description
column
There are 4 possible scenarios:
There is a value in concept
column and not in description
, in which final_desc
is the value in the concept
There is a value in description
column and not in concept
, in which final_desc
is the value in the description
The value in concept
column is not avail , in which final_desc
is the value in the description
Both the concept
and description
column are empty, in which final_desc
is empty
I tried using a where statement but that does not account for scenario 3.
df['final_desc'] = np.where(df['concept'].isnull(), df['description'], df['concept'])
I think I need a custom function but am not sure how to write to work across columns
Upvotes: 0
Views: 51
Reputation: 6055
This might do the trick:
df['final_desc'] = df.concept.replace('not avail',np.nan).fillna(df.description).fillna(df.concept.replace('not avail',np.nan))
Upvotes: 0
Reputation: 150825
You can combine a replace
and ffill/bfill
:
df['final_desc'] = (df[['concept','description']].replace('not avail',np.nan)
.bfill(1)['concept']
)
Output:
id concept description final_desc
0 12 NaN rewards member rewards member
1 12 tier one NaN tier one
2 12 not avail rewards member rewards member
Upvotes: 1