How to assign value to specific rows in dataframe that match a list?

Question

I have a dataframe of US Congressional races and want to mark several of them as abnormal. The abnormal races are:

abnormal_race = ["CA6", "CA27", "CA44", "LA6", "WA9", "LA1", "LA3", "CA5", "CA34", "CA40", "MS2", "NY6", "NY7", "NY8", "LA2", "MI13", "TX9", "TX20", "TX28", "TX30", "WA2", "AL7", "AZ7", "FL20", "FL21", "FL24", "GA5", "MA1", "MA4", "MA7", "MA8", "NY5", "NY16", "PA18", "VA3", "WI2"]

The dataframe is called a_m_d_8. I started by assigning the abnormal column:

a_m_d_8["abnormal"] = 0

a_m_d_8 has a column called state_dist with the two-letter designation for each state, and district number, for every congressional race in the dataset. It's in the same format as abnormal_race above. I want to check each row, and if its state_dist value is in abnormal_race, assign a value of 1, but have had trouble. My attempts so far have gotten a "ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()." error message.

An example attempt:

a_m_d_8.loc[a_m_d_8.state_dist in abnormal_race, "abnormal"] = 1

This threw the above error.

What does work for individual cases is:

a_m_d_8.loc[a_m_d_8.state_dist == "AL7", "abnormal"] = 1

This assigns a value of 1 to the "AL7" row. How should I go about doing this for all of the ones in abnormal_race? There must be a better way than just running the above for every one in abnormal_race.

Thanks for the help.

Ynjxsjmh · Accepted Answer

You can use pandas.Series.isin(values) to check, element by element, whether each value in a Series is contained in values. It returns a boolean Series that you can pass to .loc as a row mask.

a_m_d_8.loc[a_m_d_8.state_dist.isin(abnormal_race), "abnormal"] = 1
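Here is a minimal, self-contained sketch of the same idea on made-up data. The four rows and the shortened list are hypothetical; only the names a_m_d_8, state_dist, abnormal and abnormal_race come from the question.

```python
import pandas as pd

# Hypothetical sample; the real a_m_d_8 has one row per congressional race.
a_m_d_8 = pd.DataFrame({"state_dist": ["AL7", "AL1", "CA6", "TX1"]})
abnormal_race = ["CA6", "AL7", "WI2"]  # shortened list, for illustration only

# Start with every race marked as normal.
a_m_d_8["abnormal"] = 0

# Boolean Series: True where state_dist appears in abnormal_race.
mask = a_m_d_8["state_dist"].isin(abnormal_race)

# Set abnormal to 1 only on the matching rows.
a_m_d_8.loc[mask, "abnormal"] = 1

print(a_m_d_8)
```

If you don't need the column initialised to 0 first, a_m_d_8["abnormal"] = a_m_d_8["state_dist"].isin(abnormal_race).astype(int) should give the same result in one step.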

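You got the error because Python's in operator does not work element-wise on a Series. The expression a_m_d_8.state_dist in abnormal_race compares the whole Series with each list item using ==, which produces a boolean Series, and Python then needs to reduce that Series to a single True or False. pandas refuses to do that and raises the ValueError. isin, by contrast, does the membership test for each element separately.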
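To see the difference side by side, here is a small sketch with hypothetical sample values (the Series and the two-item list are made up for illustration):

```python
import pandas as pd

state_dist = pd.Series(["AL7", "AL1", "CA6"])  # hypothetical sample values
abnormal_race = ["CA6", "AL7"]

# `in` on a list compares the whole Series to a list item with ==,
# then asks the resulting boolean Series for a single truth value.
try:
    state_dist in abnormal_race
    raised = False
except ValueError as err:
    raised = True
    print(err)

# isin tests each element on its own and returns a boolean Series.
mask = state_dist.isin(abnormal_race)
print(mask.tolist())
```

The first expression fails before it ever produces a mask, while isin gives one True/False per row, which is what .loc needs.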
