Wizytor

Reputation: 105

Python select data for top 3 values per group in dataframe

Given the following dataframe, sorted by ID and Date:

ID  Date        Value
1   12/10/1998  0
1   04/21/2002  21030
1   08/16/2013  56792
1   09/18/2014  56792
1   09/14/2016  66354
2   06/16/2015  46645
2   12/08/2015  47641
2   12/11/2015  47641
2   04/13/2017  47641
3   07/29/2009  28616
3   03/31/2011  42127
3   03/17/2013  56000

I would like to get the values for the 3 most recent dates within each ID group:

56792
56792
66354
47641
47641
47641
28616
42127
56000

I only need the values, not the other columns.

Upvotes: 2

Views: 454

Answers (1)

yatu

Reputation: 88226

You could sort_values both by ID and Date, and use GroupBy.tail to take the values for the top 3 dates:

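import pandas as pd

# Parse the date strings so they sort chronologically rather than as text
df['Date'] = pd.to_datetime(df['Date'])
# Sort by ID and Date, then keep the last 3 rows of each ID group
df.sort_values(['ID', 'Date']).groupby('ID')['Value'].tail(3).to_numpy()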

# array([56792, 56792, 66354, 47641, 47641, 47641, 28616, 42127, 56000])
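For anyone who wants to run this end to end, here is a minimal self-contained sketch that rebuilds the sample data from the question. The variable name top3 and the explicit format string are my own additions for illustration; the format assumes the dates are month/day/year, which matches the sample.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "ID": [1, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3],
    "Date": [
        "12/10/1998", "04/21/2002", "08/16/2013", "09/18/2014", "09/14/2016",
        "06/16/2015", "12/08/2015", "12/11/2015", "04/13/2017",
        "07/29/2009", "03/31/2011", "03/17/2013",
    ],
    "Value": [
        0, 21030, 56792, 56792, 66354,
        46645, 47641, 47641, 47641,
        28616, 42127, 56000,
    ],
})

# Assumption: dates are month/day/year, as in the sample
df["Date"] = pd.to_datetime(df["Date"], format="%m/%d/%Y")

# Sort by ID and Date, then take the last 3 rows of each ID group
top3 = (
    df.sort_values(["ID", "Date"])
      .groupby("ID")["Value"]
      .tail(3)
      .to_numpy()
)

print(top3)
# [56792 56792 66354 47641 47641 47641 28616 42127 56000]
```

If the frame is already sorted by ID and Date, as stated in the question, the sort_values call is redundant but harmless; it is kept so the result does not depend on the input order.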

Upvotes: 3
