dsmess
dsmess

Reputation: 67

Create a new column based on value in another column

I have a data frame:

 id|concept    |description
 12|           |rewards member
 12|tier one   |
 12|not avail  |rewards member

GOAL: Create a new column final_desc with the content in either the concept or description column

There are 4 possible scenarios:

  1. There is a value in concept column and not in description , in which final_desc is the value in the concept

  2. There is a value in description column and not in concept, in which final_desc is the value in the description

  3. The value in concept column is not avail , in which final_desc is the value in the description

  4. Both the concept and description column are empty, in which final_desc is empty

I tried using a where statement but that does not account for scenario 3.

df['final_desc'] = np.where(df['concept'].isnull(), df['description'], df['concept'])

I think I need a custom function but am not sure how to write to work across columns

Upvotes: 0

Views: 51

Answers (2)

Hamza
Hamza

Reputation: 6055

This might do the trick:

df['final_desc'] = df.concept.replace('not avail',np.nan).fillna(df.description).fillna(df.concept.replace('not avail',np.nan))

Upvotes: 0

Quang Hoang
Quang Hoang

Reputation: 150825

You can combine a replace and ffill/bfill:

df['final_desc'] = (df[['concept','description']].replace('not avail',np.nan)
                     .bfill(1)['concept']
                   )

Output:

   id    concept     description      final_desc
0  12        NaN  rewards member  rewards member
1  12   tier one             NaN        tier one
2  12  not avail  rewards member  rewards member

Upvotes: 1

Related Questions