Reputation: 1080
I have the following dataframe:
import pandas as pd
df = pd.DataFrame({'key1': (1,1,1,2), 'key2': (1,2,3,1), 'data1': ("test","test2","t","test")})
I want to group by key1 and take the min of data1. I also want to keep the corresponding value of key2 without grouping on it.
df.groupby(['key1'], as_index=False)['data1'].min()
gets me:
key1 data1
1 t
2 test
but I need:
key1 key2 data1
1 3 t
2 1 test
Any ideas?
Upvotes: 2
Views: 639
Reputation: 29711
You can use groupby.apply to keep, within each group, every row where x['data1'] == x['data1'].min() is True. This preserves the non-grouped columns, as shown:
df.groupby('key1', group_keys=False).apply(lambda x: x[x['data1'].eq(x['data1'].min())])
To see which elements evaluate to True, and are therefore the rows kept when each group is subset:
df.groupby('key1').apply(lambda x: x['data1'].eq(x['data1'].min()))
key1
1     0    False
      1    False
      2     True
2     3     True
Name: data1, dtype: bool
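As an alternative sketch (not part of the approach above), the same row selection can probably be done without apply by broadcasting each group's minimum back onto the rows with transform('min') and comparing. The names group_min and result are made up for illustration, and the final reset_index is only there to tidy the output.

```python
import pandas as pd

df = pd.DataFrame({
    'key1': (1, 1, 1, 2),
    'key2': (1, 2, 3, 1),
    'data1': ("test", "test2", "t", "test"),
})

# Per-group minimum of data1, aligned to the original rows
# (one value per row, not one per group).
group_min = df.groupby('key1')['data1'].transform('min')

# Keep only the rows whose data1 equals its group's minimum.
# key2 is carried along because whole rows are selected.
# Ties within a group would all be kept, as with the apply version.
result = df[df['data1'].eq(group_min)].reset_index(drop=True)

print(result)
```

With the sample frame this keeps the rows (1, 3, "t") and (2, 1, "test"), which matches the output asked for in the question.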
Upvotes: 2