Reputation: 45
I have a dataset with several columns, but only two of them matter for this question: Body and Valid. Body is a comment from Twitter, and Valid is the output of an ML algorithm that determines whether the comment is valid for the project I am working on.
The problem is that I have a list of tweets from the Body column that have been predicted wrongly. I want to change the value in the Valid column whenever the Body value matches any of the values in wrong_one (which is a list).
Here wrong_one is the list and raw_data is my dataframe.
I have tried this:
import pandas as pd

raw_data = pd.DataFrame(
    {
        "SYS-ID": [1, 2, 3, 4, 5, 6, 7, 8],
        "BODY": ["LOL1", "LOL", "lol", "a", "b", "C", "hey", "ho"],
        "VALID": [True, True, True, True, True, True, True, True],
    }
)
wrong_one = ["LOL1,LOL"]
raw_data[raw_data['BODY'].isin(wrong_one), 'Valid'] = False
OUT: TypeError: 'Series' objects are mutable, thus they cannot be hashed
Upvotes: 2
Views: 186
Reputation: 164773
There are a couple of errors:
- wrong_one is a list of one string; you want a list of multiple strings.
- pd.DataFrame.loc, not pd.DataFrame.__getitem__ (for which raw_data[] is syntactic sugar), is required for setting by row and column indexers.
So you can use:
wrong_one = ['LOL1', 'LOL']
raw_data.loc[raw_data['BODY'].isin(wrong_one), 'VALID'] = False
See also Indexing and Selecting Data from the official docs.
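For reference, here is a minimal end-to-end sketch of the fix, reusing the sample data from the question. The variable names old_mask and mask are only illustrative. Note that the column label passed to .loc is 'VALID' (as the column is named in the dataframe), not 'Valid'; with .loc, a label that does not exist would create a new column rather than update the existing one.

```python
import pandas as pd

raw_data = pd.DataFrame(
    {
        "SYS-ID": [1, 2, 3, 4, 5, 6, 7, 8],
        "BODY": ["LOL1", "LOL", "lol", "a", "b", "C", "hey", "ho"],
        "VALID": [True] * 8,
    }
)

# The original list holds a single string "LOL1,LOL", so no row matches it.
old_mask = raw_data["BODY"].isin(["LOL1,LOL"])

# A list of separate strings matches each tweet individually.
wrong_one = ["LOL1", "LOL"]

# Boolean mask: True where BODY exactly equals one of the entries in wrong_one.
# The comparison is case-sensitive, so "lol" is not matched.
mask = raw_data["BODY"].isin(wrong_one)

# .loc accepts a (row indexer, column label) pair and assigns in place.
raw_data.loc[mask, "VALID"] = False

print(raw_data[["BODY", "VALID"]])
```

Only the rows whose BODY is exactly "LOL1" or "LOL" are flipped to False; every other row keeps its original value.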
Upvotes: 1