Reputation: 357
I have a data frame that sometimes has multiple values in cells like this:
df:
Fruits
apple, pineapple, mango
guava, blueberry, apple
custard-apple, cranberry
banana, kiwi, peach
apple
Now I want to filter the data frame so that it keeps only the rows where apple appears as one of the values. My output should look like this:
Fruits
apple, pineapple, mango
guava, blueberry, apple
apple
I used str.contains('apple'), but it does not return the result I want: it also matches rows such as custard-apple, cranberry, because apple is a substring of custard-apple.
Can anyone help me with how I can get this result?
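A minimal reproduction of the problem, assuming the column holds comma-separated strings exactly as shown above (the variable name mask is only for illustration):

```python
import pandas as pd

df = pd.DataFrame({
    "Fruits": [
        "apple, pineapple, mango",
        "guava, blueberry, apple",
        "custard-apple, cranberry",
        "banana, kiwi, peach",
        "apple",
    ]
})

# Plain substring match: "apple" is also found inside "custard-apple",
# so the third row is selected even though it has no standalone "apple".
mask = df["Fruits"].str.contains("apple")
print(df[mask])
```

This selects rows 0, 1, 2 and 4, whereas the desired result is rows 0, 1 and 4.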
Upvotes: 2
Views: 358
Reputation: 3708
You can use .query with .str.contains, keeping rows that contain apple and excluding rows that contain -apple:
import pandas as pd
data = {
"Fruits": ["apple, pineapple, mango", "guava, blueberry, apple", "custard-apple, cranberry",
"banana, kiwi, peach", "apple"]
}
df = pd.DataFrame(data)
df = df.query("Fruits.str.contains('apple') & ~Fruits.str.contains('-apple')").reset_index(drop=True)
print(df)
Fruits
0 apple, pineapple, mango
1 guava, blueberry, apple
2 apple
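One caveat with excluding -apple is that a row listing both custard-apple and a plain apple would be dropped as well. A possible alternative, sketched here as an assumption rather than as part of the answer above, is a regular expression that only matches apple when it is a whole comma-separated item. The last row of the sample data and the names pattern, exclusion_mask and regex_mask are hypothetical additions for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Fruits": [
        "apple, pineapple, mango",
        "guava, blueberry, apple",
        "custard-apple, cranberry",
        "banana, kiwi, peach",
        "apple",
        "custard-apple, apple",  # hypothetical extra row, not in the question
    ]
})

# Exclusion approach: any row containing "-apple" is rejected,
# even when it also lists a plain "apple".
exclusion_mask = (
    df["Fruits"].str.contains("apple") & ~df["Fruits"].str.contains("-apple")
)

# Delimiter-anchored regex: "apple" must start the string or follow a comma,
# and must be followed by a comma or the end of the string.
pattern = r"(?:^|,\s*)apple(?:\s*,|$)"
regex_mask = df["Fruits"].str.contains(pattern)

print(df[regex_mask])
```

With this data the exclusion mask drops the hypothetical last row, while the anchored pattern keeps it. Non-capturing groups are used so that str.contains does not warn about match groups.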
Upvotes: 1
Reputation: 150735
You can split the data on ', ', explode the result, then compare each item with 'apple':
mask = df['Fruits'].str.split(', ').explode().eq('apple').groupby(level=0).any()
df[mask]
Output:
Fruits
0 apple, pineapple, mango
1 guava, blueberry, apple
4 apple
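To make the one-liner easier to follow, the same chain can be broken into its intermediate steps. This is only a sketch; the names items, exploded and is_apple are chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Fruits": [
        "apple, pineapple, mango",
        "guava, blueberry, apple",
        "custard-apple, cranberry",
        "banana, kiwi, peach",
        "apple",
    ]
})

# 1. Split each cell into a list of fruit names.
items = df["Fruits"].str.split(", ")

# 2. Put each list element on its own row; the original row label is repeated.
exploded = items.explode()

# 3. Exact comparison, so "pineapple" and "custard-apple" do not match.
is_apple = exploded.eq("apple")

# 4. Collapse back to one boolean per original row: True if any item matched.
mask = is_apple.groupby(level=0).any()

print(df[mask])
```

Because the comparison is an exact match on each item, no special handling for pineapple or custard-apple is needed.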
Upvotes: 1
Reputation: 64
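Here you go. Comparing df.values with "apple" only matches cells that are exactly apple, so split each cell first and test for membership:
apple = df[df["Fruits"].str.split(", ").apply(lambda fruits: "apple" in fruits)]
print("The df with apple:", apple)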
Upvotes: -1