lwebgru

Reputation: 23

Python/Pandas: How do I replace specific values of a Pandas Data Frame based on individual id?

I have a long Pandas dataset that contains a column called 'id' and another column called 'species', among other columns. I have to perform a change on the 'species' column, based on specific values of the 'id' column.

For example, if the 'id' is '5555555' (as a string), then I want the 'species' value to change from its current value 'dove' (also a string) to 'hummingbird'. So far I have been using this method:

df.loc[df["id"] == '5555555', "species"] = 'hummingbird'

Here is a short sample data frame:

import pandas as pd
        
#Starting dataset
d = {'id': ['11111111', '22222222', '33333333', '44444444', '55555555', '66666666', '77777777', '88888888'], 'species': ['dove', 'dove', 'dove', 'hummingbird', 'hummingbird', 'dove', 'hummingbird', 'dove']}
df = pd.DataFrame(data=d)
df
    
    id          species
0   11111111    dove
1   22222222    dove        #wants to replace
2   33333333    dove        #wants to replace
3   44444444    hummingbird
4   55555555    hummingbird
5   66666666    dove
6   77777777    hummingbird
7   88888888    dove        #wants to replace        
     
#Expected outcome
d = {'id': ['11111111', '22222222', '33333333', '44444444', '55555555', '66666666', '77777777', '88888888'], 'species': ['dove', 'hummingbird', 'hummingbird', 'hummingbird', 'hummingbird', 'dove', 'hummingbird', 'hummingbird']}
df = pd.DataFrame(data=d)
df
    
    id          species
0   11111111    dove
1   22222222    hummingbird #replaced
2   33333333    hummingbird #replaced
3   44444444    hummingbird
4   55555555    hummingbird
5   66666666    dove
6   77777777    hummingbird
7   88888888    hummingbird #replaced

This is OK for a small number of rows, but I have to do this for about 1000 rows, each with its own 'id'. I thought that maybe I could use a loop and feed it the list of 'id' values, but I honestly do not know how to even start.

Thanks in advance!!

And thanks to Scott Boston for pointing me in the right direction on how to ask better questions!

Upvotes: 2

Views: 594

Answers (1)

Vishnudev Krishnadas

Reputation: 10960

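Use isin with a list of the ids you want to change. The 'id' column holds strings, so the list has to contain strings as well; integers would not match any row.

# ids to relabel, taken from the rows marked "#wants to replace" in the question
humming_ids = ['22222222', '33333333', '88888888']
df.loc[df["id"].isin(humming_ids), "species"] = 'hummingbird'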
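
For reference, here is a minimal end-to-end sketch of the isin approach on the sample data from the question. It assumes the ids to relabel are the three rows marked for replacement there, and that they are kept as strings to match the 'id' column; the names mask and humming_ids are only illustrative.

```python
import pandas as pd

# Sample data from the question; 'id' values are strings
df = pd.DataFrame({
    "id": ["11111111", "22222222", "33333333", "44444444",
           "55555555", "66666666", "77777777", "88888888"],
    "species": ["dove", "dove", "dove", "hummingbird",
                "hummingbird", "dove", "hummingbird", "dove"],
})

# Assumed list of ids to relabel (strings, to match the column's dtype)
humming_ids = ["22222222", "33333333", "88888888"]

# Boolean mask: True for every row whose id appears in the list
mask = df["id"].isin(humming_ids)

# Overwrite 'species' only on the matching rows
df.loc[mask, "species"] = "hummingbird"

print(df)
```

If the real list of about 1000 ids arrives as integers, converting it first with something like [str(i) for i in ids] should make the comparison work, since isin does not match an integer against a string.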

Upvotes: 1
