Reputation: 315
So I'm trying to make a simple filter that will take in the dataframe and filter out all rows that don't have the target genre. It'll be easier to explain with the code:
import pandas as pd

test = [
    {"genre": ["RPG", "Shooter"]},
    {"genre": ["RPG"]},
    {"genre": ["Shooter"]},
]
data = pd.DataFrame(test)
fil = data.genre.isin(['RPG'])
I want the filter to return a dataframe with the following elements:
[{"genre":["RPG"]},
{"genre":["RPG", "Shooter"]}]
This is the error I'm getting when I try my code:
SystemError: <built-in method view of numpy.ndarray object at 0x00000180D1DF2760> returned a result with an error set
Upvotes: 0
Views: 80
Reputation: 61910
The problem is that the elements of genre are lists, so isin does not work: it compares each cell as a whole against the given values instead of looking inside the list. Use:
mask = data['genre'].apply(frozenset(['RPG']).issubset)
print(data[mask])
Output
genre
0 [RPG, Shooter]
1 [RPG]
The expression:
frozenset(['RPG']).issubset
checks that every element of the set (here just 'RPG') is contained in each row's list. From the documentation:
Test whether every element in the set is in other.
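To see why passing the bound method to apply works, here is a minimal sketch in plain Python, without pandas. The names required and check are only illustrative; issubset accepts any iterable, so a list per row is fine.

```python
# issubset accepts any iterable, so each row's list can be passed directly.
required = frozenset(["RPG"])

# Bound method: check(row) is the same as required.issubset(row).
check = required.issubset

rows = (["RPG", "Shooter"], ["RPG"], ["Shooter"])
results = [check(row) for row in rows]
print(results)  # [True, True, False]
```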
So you could also check for multiple values easily, for example:
mask = data['genre'].apply(frozenset(['RPG', "Shooter"]).issubset)
print(data[mask])
Output
genre
0 [RPG, Shooter]
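If you sometimes need rows that contain any of the target genres rather than all of them, one possible sketch is below. The helper filter_by_genres and its match parameter are hypothetical names introduced for illustration, not part of pandas; isdisjoint is the standard set method used for the "any" case.

```python
import pandas as pd


def filter_by_genres(df, wanted, match="all", column="genre"):
    """Hypothetical helper: keep rows whose list in `column` matches `wanted`."""
    wanted = frozenset(wanted)
    if match == "all":
        # Every wanted genre must appear in the row's list.
        mask = df[column].apply(wanted.issubset)
    elif match == "any":
        # At least one wanted genre must appear in the row's list.
        mask = df[column].apply(lambda genres: not wanted.isdisjoint(genres))
    else:
        raise ValueError("match must be 'all' or 'any'")
    return df[mask]


data = pd.DataFrame({"genre": [["RPG", "Shooter"], ["RPG"], ["Shooter"]]})
all_match = filter_by_genres(data, ["RPG", "Shooter"], match="all")
any_match = filter_by_genres(data, ["RPG", "Shooter"], match="any")
print(all_match)
print(any_match)
```

With the sample data, the "all" variant keeps only row 0, while the "any" variant keeps all three rows.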
Upvotes: 1
Reputation: 150735
You want:
data[data.genre.apply(lambda x: 'RPG' in x)]
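Or explode the lists and check each original row. Note that any(level=0) was removed in pandas 2.0, so group by the index level instead:
data[data.genre.explode().eq('RPG').groupby(level=0).any()]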
Output:
genre
0 [RPG, Shooter]
1 [RPG]
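The explode approach can be easier to follow when broken into steps. This is a sketch assuming a recent pandas version, where the level argument of any is no longer available and groupby(level=0) is used to collapse back to one value per original row; the intermediate variable names are only for illustration.

```python
import pandas as pd

data = pd.DataFrame({"genre": [["RPG", "Shooter"], ["RPG"], ["Shooter"]]})

# One row per list element; the original index is repeated (0, 0, 1, 2).
exploded = data["genre"].explode()

# Element-wise comparison against the target genre.
is_rpg = exploded.eq("RPG")

# Collapse back to one boolean per original row: True if any element matched.
mask = is_rpg.groupby(level=0).any()

result = data[mask]
print(result)
```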
Upvotes: 0