yololo

Reputation: 319

Pandas - filter rows with same value in one column and multiple values in another column based on the existence of a value in the latter column

I have the following base table that I would like to split into two tables: a "has guava" table containing every row for users who have guava, and a "does not have guava" table containing every row for users who do not. I'm thinking of using a flag to build the intermediate table below, but I'm not sure where to go from there.

base table

user_id fruit  
user1   passionfruit  
user1   guava
user1   banana
user2   orange
user2   coconut
user3   guava
user4   melon

has guava

user_id fruit  
user1   passionfruit  
user1   guava
user1   banana
user3   guava

does not have guava

user_id fruit  
user2   orange
user2   coconut
user4   melon

intermediate table

user_id fruit        has_guava
user1   passionfruit 0 
user1   guava        1
user1   banana       0
user2   orange       0
user2   coconut      0
user3   guava        1
user4   melon        0

Upvotes: 2

Views: 2608

Answers (2)

BENY

Reputation: 323266

You can do this without groupby by checking with isin:

# Keep every row whose user_id appears on at least one guava row
out = df[df.user_id.isin(df.loc[df.fruit.isin(['guava']), 'user_id'])]
print(out)
  user_id         fruit
0   user1  passionfruit
1   user1         guava
2   user1        banana
5   user3         guava
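
The output above covers only the "has guava" table. A minimal sketch of how the same isin mask could also give the "does not have guava" table by negating it with `~`; the DataFrame is rebuilt from the question's base table so the snippet runs on its own, and the names `guava_users`, `mask`, `has_guava` and `no_guava` are just illustrative:

```python
import pandas as pd

# Base table from the question, rebuilt here so the sketch is self-contained.
df = pd.DataFrame({
    "user_id": ["user1", "user1", "user1", "user2", "user2", "user3", "user4"],
    "fruit": ["passionfruit", "guava", "banana", "orange", "coconut", "guava", "melon"],
})

# user_id values that appear on at least one guava row.
guava_users = df.loc[df["fruit"].eq("guava"), "user_id"]

# True for every row that belongs to one of those users.
mask = df["user_id"].isin(guava_users)

has_guava = df[mask]   # all rows for user1 and user3
no_guava = df[~mask]   # all rows for user2 and user4

print(has_guava)
print(no_guava)
```

Keeping the mask in its own variable means both tables come from a single membership check.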

Upvotes: 2

Ynjxsjmh

Reputation: 30032

Try groupby then filter.

df_ = (
    df.groupby('user_id')
      .filter(lambda group: group['fruit'].eq('guava').any())
)
print(df_)

  user_id         fruit
0   user1  passionfruit
1   user1         guava
2   user1        banana
5   user3         guava
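
If you would rather continue from the has_guava flag in the question's intermediate table, one possible sketch is to spread the row-level flag to every row of the same user with groupby plus transform, then split on the result. This assumes the flag is 0/1 as shown in the question; `user_has_guava`, `has_guava_table` and `no_guava_table` are illustrative names, not something from the original post:

```python
import pandas as pd

# Base table from the question, rebuilt here so the sketch is self-contained.
df = pd.DataFrame({
    "user_id": ["user1", "user1", "user1", "user2", "user2", "user3", "user4"],
    "fruit": ["passionfruit", "guava", "banana", "orange", "coconut", "guava", "melon"],
})

# Row-level flag, matching the intermediate table in the question.
df["has_guava"] = df["fruit"].eq("guava").astype(int)

# Per-user maximum of the flag, repeated on every row of that user:
# it is 1 when the user has at least one guava row, else 0.
user_has_guava = df.groupby("user_id")["has_guava"].transform("max").eq(1)

has_guava_table = df.loc[user_has_guava, ["user_id", "fruit"]]
no_guava_table = df.loc[~user_has_guava, ["user_id", "fruit"]]

print(has_guava_table)
print(no_guava_table)
```

Unlike filter, transform returns a result aligned to the original rows, so the same boolean Series can be negated to get the second table.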

Upvotes: 1
