Reputation: 745

Filtering in Pandas

What I'm trying to do is probably best illustrated by an example. Say we have the following dataframe:

ID            Category          Label          Price
----------------------------------------------------
00001         Low               Alpha          1.00
00001         Low               Beta           1.50
00001         Med               Chi            2.00
00001         Med               Delta          2.50
00001         High              Epsilon        3.00
00001         High              Phi            3.50
00002         Low               Alpha          1.00
00002         Low               Beta           1.50
00002         Med               Chi            2.50
00002         Med               Delta          2.50
00002         High              Epsilon        3.00
00002         High              Phi            3.50

For each ID and each Label within each ID, I want to return the Label with the highest price, along with the price. For example:

ID            Category          Label          Price
----------------------------------------------------
00001         Low               Beta           1.50
00001         Med               Delta          2.50
00001         High              Phi            3.50
00002         Low               Beta           1.50
00002         Med               Delta          2.50
00002         High              Phi            3.50

Initially, I thought of doing this with nested FOR statements - like iterating through a multivariate array, but I know that's not the Pandas way.

Upvotes: 0

Answers (2)

dz3fri

Reputation: 23

Similarly, you can groupby ID, Category and Label then aggregate on Price

(df
.groupby(['ID','Category','Label'])
.agg(Price =('Price','max'))
).reset_index()

Upvotes: 1

Scott Boston

Reputation: 153460

IIUC, you can try this:

df.loc[df.groupby(['ID','Category'], group_keys=False)['Price'].idxmax()]

Output:

       ID Category  Label  Price
5   00001     High    Phi    3.5
1   00001      Low   Beta    1.5
3   00001      Med  Delta    2.5
11  00002     High    Phi    3.5
7   00002      Low   Beta    1.5
8   00002      Med    Chi    2.5

Upvotes: 3

Filtering in Pandas

Answers (2)

Related Questions