Jose S. Ameijeiras

Reputation: 45

Change a value of a column when another column's value is equal to one of the values from a list

I have a dataset with several columns, but only two of them matter for this question: Body and Valid. The first is a comment from Twitter, and the second is the output of an ML algorithm that determines whether the comment is valid for the project I am working on.

The problem is that I have a list of tweets from the Body column that have been predicted wrongly. What I want to do is change the value in the Valid column if the Body column matches any of the values in wrong_one (which is a list).

So, given that wrong_one is a list and raw_data is my dataframe, I have tried this:

raw_data = pd.DataFrame(
{
   "SYS-ID":[1,2,3,4,5,6,7,8],
    "BODY":["LOL1","LOL","lol","a","b","C","hey","ho"],
    "VALID":[True,True,True,True,True,True,True,True]
})
wrong_one = ["LOL1,LOL"]

raw_data[raw_data['BODY'].isin(wrong_one), 'Valid'] = False

OUT: TypeError: 'Series' objects are mutable, thus they cannot be hashed

Upvotes: 2

Views: 186

Answers (1)

jpp

Reputation: 164773

There are a couple of errors:

  • wrong_one is a list containing a single string, "LOL1,LOL"; you want a list of separate strings.
  • pd.DataFrame.loc, not pd.DataFrame.__getitem__ (for which raw_data[] is syntactic sugar), is required for setting by row and column indexers.
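To illustrate the first point, here is a minimal sketch comparing the masks that isin produces for the two versions of the list. The Series body and the mask variable names are made up for this illustration and use only a few of the values from the question.

```python
import pandas as pd

# A few of the BODY values from the question (illustrative subset).
body = pd.Series(["LOL1", "LOL", "lol", "a"])

# One string that happens to contain a comma: no row equals it exactly.
mask_one_string = body.isin(["LOL1,LOL"])

# Two separate strings: the first two rows match (isin is case-sensitive,
# so "lol" does not match "LOL").
mask_two_strings = body.isin(["LOL1", "LOL"])

print(mask_one_string.tolist())
print(mask_two_strings.tolist())
```

With the single-string list the mask is all False, so even a correct assignment would change nothing.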

So you can use:

wrong_one = ['LOL1', 'LOL']

raw_data.loc[raw_data['BODY'].isin(wrong_one), 'VALID'] = False

See also Indexing and Selecting Data from the official docs.
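For completeness, a self-contained sketch of the fix applied to the dataframe from the question. The intermediate variable mask is only there for readability and is not part of the original code. Note that the column is named "VALID" in the dataframe, while the question's attempt used 'Valid'; with .loc, a label that does not exist would typically create a new column instead of updating the existing one.

```python
import pandas as pd

raw_data = pd.DataFrame(
    {
        "SYS-ID": [1, 2, 3, 4, 5, 6, 7, 8],
        "BODY": ["LOL1", "LOL", "lol", "a", "b", "C", "hey", "ho"],
        "VALID": [True, True, True, True, True, True, True, True],
    }
)

# Each wrongly predicted tweet is its own list element.
wrong_one = ["LOL1", "LOL"]

# Boolean mask: True for rows whose BODY is in wrong_one.
mask = raw_data["BODY"].isin(wrong_one)

# .loc takes (row indexer, column label) and assigns in place.
raw_data.loc[mask, "VALID"] = False

print(raw_data)
```

Only the rows with BODY equal to "LOL1" or "LOL" are set to False; "lol" is left untouched because the comparison is case-sensitive.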

Upvotes: 1
