Reputation: 303
I have a dataset in the following form:
time color height weight value
1 t1 red hr1 wr1 vr1
2 t1 red hr1 wr1 vr1
3 t1 blue hb1 wb1 vb1
4 t1 blue hb1 wb1 vb1
5 t1 green hg1 wg1 vg1
6 t1 green hg1 wg1 vg1
7 t2 blue hb2 wb2 vb2
8 t2 green hg2 wg2 vg2
9 t2 red hr2 wr2 vr2
10 t2 red hr2 wr2 vr2
11 t3 red hr3 wr3 vr3
12 t3 red hr3 wr3 vr3
13 t3 green hg3 wg3 vg3
14 t3 green hg3 wg3 vg3
15 t3 blue hb3 wb3 vb3
16 t3 blue hb3 wb3 vb3
I would like to drop every time value whose colors do not each have a count of exactly 2 (for red, blue, and green). In the given snippet, t1 and t3 should be retained, and all rows for t2 should be dropped.
The result should be:
time color height weight value
1 t1 red hr1 wr1 vr1
2 t1 red hr1 wr1 vr1
3 t1 blue hb1 wb1 vb1
4 t1 blue hb1 wb1 vb1
5 t1 green hg1 wg1 vg1
6 t1 green hg1 wg1 vg1
7 t3 red hr3 wr3 vr3
8 t3 red hr3 wr3 vr3
9 t3 green hg3 wg3 vg3
10 t3 green hg3 wg3 vg3
11 t3 blue hb3 wb3 vb3
12 t3 blue hb3 wb3 vb3
Thank you,
Upvotes: 2
Views: 124
Reputation: 863701
Use GroupBy.transform twice. It returns a Series the same size as the original DataFrame, so the result can be used for boolean indexing:
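# Count rows per (time, color), flag counts equal to 2, then keep a time only if all of its rows are flagged
df1 = df[df.groupby(['time', 'color'])['color']
           .transform('size')
           .eq(2)
           .groupby(df['time'])
           .transform('all')]
print(df1)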
time color height weight value
1 t1 red hr1 wr1 vr1
2 t1 red hr1 wr1 vr1
3 t1 blue hb1 wb1 vb1
4 t1 blue hb1 wb1 vb1
5 t1 green hg1 wg1 vg1
6 t1 green hg1 wg1 vg1
11 t3 red hr3 wr3 vr3
12 t3 red hr3 wr3 vr3
13 t3 green hg3 wg3 vg3
14 t3 green hg3 wg3 vg3
15 t3 blue hb3 wb3 vb3
16 t3 blue hb3 wb3 vb3
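For anyone who wants to see what each step of the chain produces, here is a minimal self-contained sketch. It rebuilds only the time and color columns from the question (the other columns do not affect the filter), and the intermediate names pair_size, is_pair, keep and strict are my own, chosen for illustration.

```python
import pandas as pd

# Only the two grouping columns from the question are reproduced here.
times = ['t1'] * 6 + ['t2'] * 4 + ['t3'] * 6
colors = (['red', 'red', 'blue', 'blue', 'green', 'green']
          + ['blue', 'green', 'red', 'red']
          + ['red', 'red', 'green', 'green', 'blue', 'blue'])
df = pd.DataFrame({'time': times, 'color': colors}, index=range(1, 17))

# Step 1: size of each (time, color) group, broadcast back to every row.
pair_size = df.groupby(['time', 'color'])['color'].transform('size')

# Step 2: row-level flag, True where the row's group has exactly 2 members.
is_pair = pair_size.eq(2)

# Step 3: a time is kept only if every one of its rows is flagged.
keep = is_pair.groupby(df['time']).transform('all')

# Optional stricter mask (an assumption about the intent, not part of the
# answer above): also require that the time contains every distinct color.
n_colors = df.groupby('time')['color'].transform('nunique')
strict = keep & n_colors.eq(df['color'].nunique())

df1 = df[keep]
print(df1['time'].unique().tolist())  # ['t1', 't3']
```

One caveat: the plain mask only inspects colors that actually occur within a time, so a time containing nothing but two red rows would pass. If every color must be present, something like the strict mask may be closer to what is wanted. On the question's data both masks select the same rows.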
Upvotes: 0
Reputation: 18647
How about:
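# Number of rows for each (time, color) pair
s = df.groupby(['time', 'color']).size()
# One column per time; a time is valid only if every color count equals 2
s = s.unstack(0).eq(2).all()
valid_times = s.index[s]
print(df[df.time.isin(valid_times)])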
time color height weight value
1 t1 red hr1 wr1 vr1
2 t1 red hr1 wr1 vr1
3 t1 blue hb1 wb1 vb1
4 t1 blue hb1 wb1 vb1
5 t1 green hg1 wg1 vg1
6 t1 green hg1 wg1 vg1
11 t3 red hr3 wr3 vr3
12 t3 red hr3 wr3 vr3
13 t3 green hg3 wg3 vg3
14 t3 green hg3 wg3 vg3
15 t3 blue hb3 wb3 vb3
16 t3 blue hb3 wb3 vb3
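Here is a self-contained sketch of the same idea with the intermediate table kept in a variable. It uses only the time and color columns from the question and adds a hypothetical t4 with just two red rows (not in the original data) to show how a missing color is treated. The names counts, ok and result are mine.

```python
import pandas as pd

# time/color columns from the question, plus a hypothetical t4 (red only).
times = ['t1'] * 6 + ['t2'] * 4 + ['t3'] * 6 + ['t4'] * 2
colors = (['red', 'red', 'blue', 'blue', 'green', 'green']
          + ['blue', 'green', 'red', 'red']
          + ['red', 'red', 'green', 'green', 'blue', 'blue']
          + ['red', 'red'])  # hypothetical rows for illustration
df = pd.DataFrame({'time': times, 'color': colors}, index=range(1, 19))

# Rows are colors, columns are times; absent (time, color) pairs become NaN.
counts = df.groupby(['time', 'color']).size().unstack(0)

# NaN never equals 2, so a time with a missing color fails the check.
ok = counts.eq(2).all()
valid_times = ok[ok].index

result = df[df['time'].isin(valid_times)]
print(list(valid_times))  # ['t1', 't3']
```

Because unstack fills absent pairs with NaN, t4 is rejected here even though its only color has a count of 2. This works as long as each color appears under at least one time in the data.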
Upvotes: 1