Reputation: 3058
I have the following data frame:
import pandas as pd
data = dict(name=['a', 'a', 'a', 'b', 'b', 'b', 'c', 'c'],
objective=[20.0, 20.0, 25.0, 40.0, 40.5, 41.0, 60.0, 60.0],
price=[0.5, 1.0, 1.5, 1.0, 1.2, 1.4, 0.5, 1.0])
df = pd.DataFrame(data, columns=data.keys())
I can then find the unique combinations of all these as follows:
df.groupby(['name','objective', 'price']).size()
which looks like this:
name objective price
a 20.0 0.5 1
1.0 1
25.0 1.5 1
b 40.0 1.0 1
40.5 1.2 1
41.0 1.4 1
c 60.0 0.5 1
1.0 1
When there are multiple price
values for a given name
and objective
, I want to keep the lower price
value only, i.e.
name objective price
a 20.0 0.5 1
25.0 1.5 1
b 40.0 1.0 1
40.5 1.2 1
41.0 1.4 1
c 60.0 0.5 1
How can I achieve this, please?
Upvotes: 0
Views: 46
Reputation: 323316
What I will do
df.sort_values('price').drop_duplicates(['name','objective'],keep='last').assign(cnt=1)
Out[421]:
name objective price cnt
0 a 20.0 0.5 1
2 a 25.0 1.5 1
3 b 40.0 1.0 1
4 b 40.5 1.2 1
5 b 41.0 1.4 1
6 c 60.0 0.5 1
Upvotes: 2
Reputation: 507
you can use groupby and minimum
df = df.groupby(['name','objective']).min()
Upvotes: 1
Reputation: 150775
You can do another groupby
and first
:
(df.groupby(['name','objective', 'price']).size()
.reset_index()
.groupby(['name', 'objective'])
.first()
)
Output:
price 0
name objective
a 20.0 0.5 1
25.0 1.5 1
b 40.0 1.0 1
40.5 1.2 1
41.0 1.4 1
c 60.0 0.5 1
Upvotes: 2