Reputation: 111
I have a dataframe on US Congressional races, and I want to mark several of them as abnormal. The abnormal ones are:
abnormal_race = ["CA6", "CA27", "CA44", "LA6", "WA9", "LA1", "LA3", "CA5", "CA34", "CA40", "MS2", "NY6", "NY7", "NY8", "LA2", "MI13", "TX9", "TX20", "TX28", "TX30", "WA2", "AL7", "AZ7", "FL20", "FL21", "FL24", "GA5", "MA1", "MA4", "MA7", "MA8", "NY5", "NY16", "PA18", "VA3", "WI2"]
The dataframe is called a_m_d_8. I started by creating the abnormal column:
a_m_d_8["abnormal"] = 0
a_m_d_8 has a column called state_dist holding the two-letter state abbreviation and district number for every congressional race in the dataset, in the same format as abnormal_race above. I want to check each row and, if its state_dist value is in abnormal_race, assign a value of 1, but I have had trouble. My attempts so far have raised "ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()."
An example attempt:
a_m_d_8.loc[a_m_d_8.state_dist in abnormal_race, "abnormal"] = 1
This threw the above error.
What does work for individual cases is:
a_m_d_8.loc[a_m_d_8.state_dist == "AL7", "abnormal"] = 1
This assigns a value of 1 to the "AL7" row. How should I do this for every entry in abnormal_race? There must be a better way than running the above once per entry.
Thanks for the help.
Upvotes: 0
Views: 43
Reputation: 29992
You can use pandas.Series.isin(values) to check, element by element, whether each value in a Series is contained in values.
a_m_d_8.loc[a_m_d_8.state_dist.isin(abnormal_race), "abnormal"] = 1
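As a minimal, self-contained sketch of the same idea: the small dataframe and the shortened abnormal_race list below are made-up stand-ins, since the real data is not shown in the question. Because isin returns a boolean Series, the flag column can also be built in one step with astype(int), which should give the same result as initializing to 0 and then assigning 1 through .loc.

```python
import pandas as pd

# Hypothetical miniature stand-in for a_m_d_8 (the real data is not shown).
a_m_d_8 = pd.DataFrame({"state_dist": ["AL7", "AL1", "CA6", "TX1", "WI2"]})
abnormal_race = ["CA6", "AL7", "WI2"]  # shortened list, for illustration only

# isin returns one True/False per row; astype(int) converts that to 1/0.
a_m_d_8["abnormal"] = a_m_d_8["state_dist"].isin(abnormal_race).astype(int)

print(a_m_d_8)
```

The .loc form is the more natural choice when the column already exists and only the matching rows should be overwritten.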
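You got the error because Python's in operator is not element-wise. a_m_d_8.state_dist in abnormal_race compares the whole Series against each list item with ==, which produces a boolean Series, and converting that Series to a single True/False raises the ValueError.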
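The difference can be reproduced on a tiny example. This is only a sketch: the two-element Series s is a hypothetical stand-in for a_m_d_8.state_dist.

```python
import pandas as pd

s = pd.Series(["AL7", "AL1"])  # hypothetical stand-in for a_m_d_8.state_dist

try:
    # The list membership test compares each item to the whole Series with ==,
    # then needs a single True/False from the resulting boolean Series.
    s in ["AL7", "CA6"]
    raised = False
except ValueError as err:
    raised = True
    print(err)

# isin performs the membership test per element and returns a boolean Series.
mask = s.isin(["AL7", "CA6"])
print(mask.tolist())
```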
Upvotes: 1