Reputation: 537
I have a dataframe with an ID column and a measurement column
ID measurement
1 A
1 B
1 C
1 D
2 A
2 B
2 C
2 D
3 A
4 A
4 B
4 C
4 D
4 E
I want to select only rows which have full measurements (A-D) per ID and remove rows which either have fewer (for example ID 3) or more measurements (for example ID 4) per ID.
Is there an efficient way to achieve that?
Upvotes: 1
Views: 572
Reputation: 76927
Using transform
In [1001]: df[df.groupby('ID')['measurement'].transform(lambda x: set(x) == set('ABCD'))]
Out[1001]:
ID measurement
0 1 A
1 1 B
2 1 C
3 1 D
4 2 A
5 2 B
6 2 C
7 2 D
Upvotes: 1
Reputation: 210852
In [92]: df.groupby('ID').filter(lambda x: set(x['measurement']) == set('ABCD'))
Out[92]:
ID measurement
0 1 A
1 1 B
2 1 C
3 1 D
4 2 A
5 2 B
6 2 C
7 2 D
Upvotes: 2