Reputation: 129
I have a pandas dataframe with multiple IDs and with other columns I have one date columns say : 'date1'. I want to get all the rows for a ID which has both the values 'a' and 'b' in the column 'value'. The others column values should also be retained.
What I have:
ID date1 value
1 1/1/2013 a
1 4/1/2013 a
1 8/3/2014 b
2 11/4/2013 a
2 19/5/2016 a
3 8/4/2017 b
3 19/4/2017 a
4 19/5/2016 a
4 8/4/2017 b
4 19/4/2017 a
5 19/5/2016 b
5 8/4/2017 b
What I want :
ID date1 value
1 1/1/2013 a
1 4/1/2013 a
1 8/3/2014 b
3 8/4/2017 b
3 19/4/2017 a
4 19/5/2016 a
4 8/4/2017 b
4 19/4/2017 a
Upvotes: 0
Views: 1250
Reputation: 19957
(
df.groupby('ID')
.filter(lambda x: all([e in x.value.tolist() for e in ('a','b')]))
)
Upvotes: 1
Reputation: 18426
You can group dataframe by ID
, then filter
the groups which has a
, and b
in the value
columns using set.issubset
df.groupby('ID').filter(lambda x: {'a', 'b'}.issubset(x['value']))
ID date1 value
0 1 1/1/2013 a
1 1 4/1/2013 a
2 1 8/3/2014 b
5 3 8/4/2017 b
6 3 19/4/2017 a
7 4 19/5/2016 a
8 4 8/4/2017 b
9 4 19/4/2017 a
Upvotes: 4