Reputation: 23
I have a long pandas dataset that contains a column called 'id' and another column called 'species', among other columns. I need to change the 'species' column based on specific values of the 'id' column.
For example, if the 'id' is '5555555' (as a string), then I want the 'species' value to change from its current value 'dove' (also a string) to 'hummingbird'. So far I have been using this method:
df.loc[df["id"] == '5555555', "species"] = 'hummingbird'
Here is a short sample data frame:
import pandas as pd
#Starting dataset
d = {'id': ['11111111', '22222222', '33333333', '44444444', '55555555', '66666666', '77777777', '88888888'], 'species': ['dove', 'dove', 'dove', 'hummingbird', 'hummingbird', 'dove', 'hummingbird', 'dove']}
df = pd.DataFrame(data=d)
df
         id      species
0  11111111         dove
1  22222222         dove  #wants to replace
2  33333333         dove  #wants to replace
3  44444444  hummingbird
4  55555555  hummingbird
5  66666666         dove
6  77777777  hummingbird
7  88888888         dove  #wants to replace
#Expected outcome
d = {'id': ['11111111', '22222222', '33333333', '44444444', '55555555', '66666666', '77777777', '88888888'], 'species': ['dove', 'hummingbird', 'hummingbird', 'hummingbird', 'hummingbird', 'dove', 'hummingbird', 'hummingbird']}
df = pd.DataFrame(data=d)
df
         id      species
0  11111111         dove
1  22222222  hummingbird  #replaced
2  33333333  hummingbird  #replaced
3  44444444  hummingbird
4  55555555  hummingbird
5  66666666         dove
6  77777777  hummingbird
7  88888888  hummingbird  #replaced
This is fine for a small number of rows, but I have to do this for about 1000 rows, each with its own 'id'. I thought that maybe I could write a loop and feed it the list of 'id' values, but I honestly do not know how to even start.
Thanks in advance!!
And thanks to Scott Boston for pointing me in the right direction to ask better questions!
Upvotes: 2
Views: 594
Reputation: 10960
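Use isin with a list of the ids to change. The 'id' column holds strings, so the list must hold strings as well (integers would match nothing):
humming_ids = ['22222222', '33333333', '88888888']
df.loc[df["id"].isin(humming_ids), "species"] = 'hummingbird'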
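For reference, here is a minimal end-to-end sketch of the isin approach run against the sample data from the question. The id list is taken from the rows the question marks for replacement, and the variable name mask is introduced here only for illustration. It assumes the 'id' column holds strings, as in the sample, so the lookup list holds strings too.

```python
import pandas as pd

# Sample data from the question; note that the ids are strings.
df = pd.DataFrame({
    "id": ["11111111", "22222222", "33333333", "44444444",
           "55555555", "66666666", "77777777", "88888888"],
    "species": ["dove", "dove", "dove", "hummingbird",
                "hummingbird", "dove", "hummingbird", "dove"],
})

# Ids whose species should become 'hummingbird'.
# They must be strings to match the dtype of the 'id' column.
humming_ids = ["22222222", "33333333", "88888888"]

# Boolean mask: True for every row whose id appears in the list.
mask = df["id"].isin(humming_ids)

# Assign the new value only to the matching rows, in one vectorized step.
df.loc[mask, "species"] = "hummingbird"

print(df)
```

With about 1000 ids the same pattern applies unchanged: only humming_ids grows, and no explicit loop is needed.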
Upvotes: 1