yellow_smile
yellow_smile

Reputation: 129

Get IDs which have multiple values in a column

I have a pandas dataframe with multiple IDs and with other columns I have one date columns say : 'date1'. I want to get all the rows for a ID which has both the values 'a' and 'b' in the column 'value'. The others column values should also be retained.

What I have:

 ID   date1        value
 1    1/1/2013     a
 1    4/1/2013     a
 1    8/3/2014     b
 2    11/4/2013    a
 2    19/5/2016    a
 3    8/4/2017     b
 3    19/4/2017    a
 4    19/5/2016    a
 4    8/4/2017     b
 4    19/4/2017    a
 5    19/5/2016    b
 5    8/4/2017     b

What I want :

 ID   date1        value
 1    1/1/2013     a
 1    4/1/2013     a
 1    8/3/2014     b
 3    8/4/2017     b
 3    19/4/2017    a
 4    19/5/2016    a
 4    8/4/2017     b
 4    19/4/2017    a

Upvotes: 0

Views: 1250

Answers (2)

Allen Qin
Allen Qin

Reputation: 19957

(
    df.groupby('ID')
    .filter(lambda x: all([e in x.value.tolist() for e in ('a','b')]))
)

Upvotes: 1

ThePyGuy
ThePyGuy

Reputation: 18426

You can group dataframe by ID, then filter the groups which has a, and b in the value columns using set.issubset

df.groupby('ID').filter(lambda x: {'a', 'b'}.issubset(x['value']))

   ID      date1 value
0   1   1/1/2013     a
1   1   4/1/2013     a
2   1   8/3/2014     b
5   3   8/4/2017     b
6   3  19/4/2017     a
7   4  19/5/2016     a
8   4   8/4/2017     b
9   4  19/4/2017     a

Upvotes: 4

Related Questions