Reputation: 783
Hello, I have a df such as:
col1 col2
G1 A
G1 B
G1 C
G1 D
G2 E
G2 F
G2 G
G3 H
G4 I
G4 J
G4 K
and a list, liste = ['A', 'I', 'K'].
I would like to remove every group (by col1) that does not have at least one col2 value present in liste.
Here I should keep only G1 and G4, and get:
col1 col2
G1 A
G1 B
G1 C
G1 D
G4 I
G4 J
G4 K
Does anyone have an idea?
Upvotes: 1
Views: 59
Reputation: 42916
isin, GroupBy.transform and any
First we use isin to check which rows contain an element from your liste. Then we GroupBy on col1 and check whether any of the rows in a group contain an element of the list.
The reason we use transform here rather than a plain GroupBy.any is that we want a vector back with the same length as your dataframe, so it can be used as a row-wise boolean mask.
df[df['col2'].isin(liste).groupby(df['col1']).transform('any')]
col1 col2
0 G1 A
1 G1 B
2 G1 C
3 G1 D
8 G4 I
9 G4 J
10 G4 K
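To make the intermediate steps visible, here is a minimal self-contained sketch that rebuilds the dataframe from the question and splits the one-liner into its parts. The names row_hit, group_hit and per_group are only illustrative and do not appear in the original answer.

```python
import pandas as pd

# Rebuild the example data from the question
df = pd.DataFrame({
    "col1": ["G1"] * 4 + ["G2"] * 3 + ["G3"] + ["G4"] * 3,
    "col2": list("ABCDEFGHIJK"),
})
liste = ["A", "I", "K"]

# Step 1: one boolean per row, True where col2 is in liste
row_hit = df["col2"].isin(liste)

# Step 2: group those booleans by col1 and broadcast "any True in the group"
# back onto every row, so the result has the same length as df
group_hit = row_hit.groupby(df["col1"]).transform("any")

# Step 3: use the broadcast mask for ordinary boolean indexing
result = df[group_hit]

# For contrast: a plain any() returns one value per group, not per row,
# so it cannot be used directly as a row mask
per_group = row_hit.groupby(df["col1"]).any()
```

Here per_group has four entries (one per group) while group_hit has eleven (one per row), which is why transform is the one that can index df.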
Upvotes: 3
Reputation: 149075
You could use groupby and apply:
df.groupby('col1').apply(
    lambda x: x if any(i in x['col2'].values for i in liste) else None
).reset_index(level=0, drop=True)
It gives:
col1 col2
0 G1 A
1 G1 B
2 G1 C
3 G1 D
8 G4 I
9 G4 J
10 G4 K
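As a possible alternative to returning the group or None from apply, pandas also provides GroupBy.filter, which keeps or drops whole groups based on a boolean returned per group. This is a different technique from the apply call above, shown only as a sketch on the question's data; the name kept is illustrative.

```python
import pandas as pd

# Rebuild the example data from the question
df = pd.DataFrame({
    "col1": ["G1"] * 4 + ["G2"] * 3 + ["G3"] + ["G4"] * 3,
    "col2": list("ABCDEFGHIJK"),
})
liste = ["A", "I", "K"]

# Keep a group only if at least one of its col2 values is in liste.
# filter() returns the surviving rows with their original index,
# so no reset_index is needed afterwards.
kept = df.groupby("col1").filter(lambda g: g["col2"].isin(liste).any())
```

On large frames the isin/transform mask is usually the faster option, since filter calls the Python function once per group.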
Upvotes: 1