Number Logic

Reputation: 894

first record in a subset of a dataframe

I have a dataframe that looks something like:

[Image: sample dataframe]

I want to keep all columns of the dataframe but only the top record (by amount) for each ID. So in this case, I would expect to see:

1 Long  10
2 Short -2

Any ideas how to do this in pandas?

Upvotes: 1

Views: 136

Answers (2)

ansev

Reputation: 30930

We can also use DataFrame.sort_values followed by DataFrame.drop_duplicates on ID (it keeps the first occurrence by default):

df.sort_values("Amount", ascending=False).drop_duplicates('ID')
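
As a minimal sketch of how this plays out: the original image is not available, so the rows below and the "Position" column name are assumptions, chosen only so that the result matches the expected output in the question.

```python
import pandas as pd

# Hypothetical sample data (the question's image is missing); the
# "Position" column name and all values are assumptions.
df = pd.DataFrame({
    "ID": [1, 1, 2, 2],
    "Position": ["Long", "Short", "Long", "Short"],
    "Amount": [10, 5, -3, -2],
})

# Sort so the largest Amount comes first, then keep the first row per ID.
top = df.sort_values("Amount", ascending=False).drop_duplicates("ID")

# Optional: restore ID order for display.
top = top.sort_values("ID")
print(top)
```

The rows that survive keep their original index labels, which can be handy if you need to trace them back to the source dataframe.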

Another alternative is groupby.first with as_index=False, which keeps ID as a regular column instead of making it the index:

df.sort_values("Amount", ascending=False).groupby("ID", as_index=False).first()
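
A small sketch of what as_index=False changes, again on made-up data (the "Position" column name and the values are assumptions, not taken from the question's image):

```python
import pandas as pd

# Hypothetical sample data; column "Position" and all values are assumptions.
df = pd.DataFrame({
    "ID": [1, 1, 2, 2],
    "Position": ["Long", "Short", "Long", "Short"],
    "Amount": [10, 5, -3, -2],
})

result = (
    df.sort_values("Amount", ascending=False)
      .groupby("ID", as_index=False)
      .first()
)

# ID stays a regular column and the result gets a fresh 0..n-1 index.
print(result.columns.tolist())
print(result)
```

Unlike the drop_duplicates variant, the original index labels are not preserved here.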

Upvotes: 2

DYZ

Reputation: 57105

Sort the dataframe by Amount in descending order (in case it is not sorted already), then group by ID (or position, whichever is relevant), and pick the first row from each group:

df.sort_values("Amount", ascending=False).groupby("ID").first()
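
One thing worth knowing when using this: the result is indexed by ID, and GroupBy.first takes the first non-null value of each column separately, so with missing data it can combine values from different rows. The sketch below illustrates this on made-up data (the "Position" column and all values are assumptions) and shows head(1) as one way to get whole rows instead.

```python
import pandas as pd

# Hypothetical data with a missing Position in the top-Amount row.
df = pd.DataFrame({
    "ID": [1, 1],
    "Position": [None, "Short"],
    "Amount": [10, 5],
})

ordered = df.sort_values("Amount", ascending=False)

# first(): first non-null value per column, result indexed by ID.
by_id = ordered.groupby("ID").first()

# head(1): the first whole row of each group, missing values included.
whole_rows = ordered.groupby("ID").head(1)

print(by_id)
print(whole_rows)
```

If the data has no missing values, both give the same rows and the difference is only in the index.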

Upvotes: 2
