Falco
Falco

Reputation: 183

How to select the 3 last dates in Python

I have a dataset that looks like his:

ID date
1  O1-01-2012
1  05-02-2012
1  25-06-2013
1  14-12-2013
1  10-04-2014
2  19-05-2012
2  07-08-2014
2  10-09-2014
2  27-11-2015
2  01-12-2015
3  15-04-2013
3  17-05-2015
3  22-05-2015
3  30-10-2016
3  02-11-2016

I am working with Python and I would like to select the 3 last dates for each ID. Here is the dataset I would like to have:

ID date
1  25-06-2013
1  14-12-2013
1  10-04-2014
2  10-09-2014
2  27-11-2015
2  01-12-2015
3  22-05-2015
3  30-10-2016
3  02-11-2016

I used this code to select the very last date for each ID:

df_2=df.sort_values(by=['date']).drop_duplicates(subset='ID',keep='last')

But how can I select more than one date (for example the 3 last dates, or 4 last dates, etc)?

Upvotes: 1

Views: 271

Answers (3)

Anuj Suhag
Anuj Suhag

Reputation: 71

I tried this but with a non-datetime data type

a = [1,1,1,1,1,2,2,2,2,2,3,3,3,3,3]
b = ['a','b','c','d','e','f','g','h','i','j','k','l','m','n','o']

import pandas as pd
import numpy as np
a = np.array([a,b])

df=pd.DataFrame(a.T,columns=['ID','Date'])

# the tail would give you the last n number of elements you are interested in
df_ = df.groupby('ID').tail(3)

df_

output:

  ID    Date
2   1   c
3   1   d
4   1   e
7   2   h
8   2   i
9   2   j
12  3   m
13  3   n
14  3   o

Upvotes: 0

gpliu
gpliu

Reputation: 249

can try this:

df.sort_values(by=['date']).groupby('ID').tail(3).sort_values(['ID', 'date'])

Upvotes: 1

Daweo
Daweo

Reputation: 36360

You might use groupby and tail following way to get 2 last items from each group:

import pandas as pd
df = pd.DataFrame({'ID':[1,1,1,2,2,2,3,3,3],'value':['A','B','C','D','E','F','G','H','I']})
df2 = df.groupby('ID').tail(2)
print(df2)

Output:

   ID value
1   1     B
2   1     C
4   2     E
5   2     F
7   3     H
8   3     I

Note that for simplicity sake I used other (already sorted) data for building df.

Upvotes: 2

Related Questions