Reputation: 319
I have the following base table that I would like to split into two tables: one with every row for users who have guava, and one with every row for users who do not. I'm thinking of using a flag to build the intermediate table below, but I'm not sure where to go from there.
base table
user_id fruit
user1 passionfruit
user1 guava
user1 banana
user2 orange
user2 coconut
user3 guava
user4 melon
has guava
user_id fruit
user1 passionfruit
user1 guava
user1 banana
user3 guava
does not have guava
user_id fruit
user2 orange
user2 coconut
user4 melon
intermediate table
user_id fruit has_guava
user1 passionfruit 0
user1 guava 1
user1 banana 0
user2 orange 0
user2 coconut 0
user3 guava 1
user4 melon 0
Upvotes: 2
Views: 2608
Reputation: 323266
Without groupby, check isin:
out = df[df.user_id.isin(df.loc[df.fruit.isin(['guava']),'user_id'])]
Out[322]:
user_id fruit
0 user1 passionfruit
1 user1 guava
2 user1 banana
5 user3 guava
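The question also asks for the "does not have guava" table. One way to get it is to keep the boolean mask and negate it with ~. This is a minimal sketch, assuming df is rebuilt from the base table in the question; the names guava_users, has_guava_df and no_guava_df are only for illustration.

```python
import pandas as pd

# Base table from the question (reconstructed here so the example is self-contained).
df = pd.DataFrame({
    "user_id": ["user1", "user1", "user1", "user2", "user2", "user3", "user4"],
    "fruit": ["passionfruit", "guava", "banana", "orange", "coconut", "guava", "melon"],
})

# user_ids that appear on at least one guava row.
guava_users = df.loc[df["fruit"].eq("guava"), "user_id"]

# True for every row belonging to one of those users.
mask = df["user_id"].isin(guava_users)

has_guava_df = df[mask]    # all rows for users who have guava
no_guava_df = df[~mask]    # all rows for everyone else
```

Storing the mask once means both tables come from the same condition, so every row lands in exactly one of them.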
Upvotes: 2
Reputation: 30032
Try groupby, then filter.
# Keep only the groups (users) that have at least one 'guava' row.
df_ = (
    df.groupby('user_id')
      .filter(lambda group: group['fruit'].eq('guava').any())
)
print(df_)
user_id fruit
0 user1 passionfruit
1 user1 guava
2 user1 banana
5 user3 guava
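If you would rather start from the has_guava flag in the question's intermediate table, one option is to spread the row-level flag to every row of the same user with groupby plus transform, then split on the result. This is a sketch under the assumption that df matches the base table in the question; user_has_guava, has_guava_df and no_guava_df are made-up names.

```python
import pandas as pd

# Base table from the question (reconstructed here so the example is self-contained).
df = pd.DataFrame({
    "user_id": ["user1", "user1", "user1", "user2", "user2", "user3", "user4"],
    "fruit": ["passionfruit", "guava", "banana", "orange", "coconut", "guava", "melon"],
})

# Row-level flag, as in the intermediate table: 1 only on guava rows.
df["has_guava"] = df["fruit"].eq("guava").astype(int)

# Per-user flag aligned to the original rows: the max of the flag within each user.
user_has_guava = df.groupby("user_id")["has_guava"].transform("max").astype(bool)

# Split on the per-user flag and drop the helper column from the results.
has_guava_df = df.loc[user_has_guava, ["user_id", "fruit"]]
no_guava_df = df.loc[~user_has_guava, ["user_id", "fruit"]]
```

Unlike filter with a lambda, transform('max') returns a Series with the same index as df, so the same mask gives both the "has guava" and the "does not have guava" tables.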
Upvotes: 1