caasswa
caasswa

Reputation: 521

How to rename Pandas columns based on mapping?

I have a dataframe where column Name contains values such as the following (the rest of the columns do not affect how this question is answered I hope):

Chicken
Chickens
Fluffy Chicken
Whale
Whales
Blue Whale
Tiger
White Tiger
Big Tiger

Now, I want to ensure that we rename these entries to be like the following:

Chicken
Chicken
Chicken
Whale
Whale
Whale
Tiger
Tiger
Tiger

Essentially substituting anything that has 'Chicken' to just be 'Chicken, anything with 'Whale' to be just 'Whale, and anything with 'Tiger' to be just 'Tiger'.

What is the best way to do this? There are almost 1 million rows in the dataframe.

Sorry just to add, I have a list of what we expect i.e.

['Chicken', 'Whale', 'Tiger']

They can appear in any order in the column

What I should also add is, the column might contain things like "Mushroom" or "Eggs" that do not need substituting from the original list.

Upvotes: 0

Views: 433

Answers (1)

BENY
BENY

Reputation: 323226

Try with str.extract

#l = ['Chicken', 'Whale', 'Tiger']

df['new'] = df['col'].str.extract('('+'|'.join(l)+')')[0]
Out[10]: 
0    Chicken
1    Chicken
2    Chicken
3      Whale
4      Whale
5      Whale
6      Tiger
7      Tiger
8      Tiger
Name: 0, dtype: object

Upvotes: 2

Related Questions