Reputation: 475
Suppose my dataset looks like this:
data = {'name':['A1','A2','A2','A1','A3','A3'],
'date_create':['28-01-2020','15-02-2020','15-03-2020',
'25-03-2020','01-04-2020','30-05-2020']}
df = pd.DataFrame(data)
I want to create a new column that indicate the order of month, group by each name. For example the customer A1 has 2 date '28-01-2020' and '25-03-2020', so the order of month for this name are 2 and 1 (the most recent are 1)
The result should looks like this:
name date_create order
0 A1 28-01-2020 2
1 A2 15-02-2020 2
2 A2 15-03-2020 1
3 A1 25-03-2020 1
4 A3 01-04-2020 2
5 A3 30-05-2020 1
Upvotes: 1
Views: 51
Reputation: 26676
sort_values, groupby name and cuncount
df['order']=df.sort_values(by=['name', 'date_create']).groupby('name').cumcount()+1
name date_create order
0 A1 28-01-2020 2
1 A2 15-02-2020 1
2 A2 15-03-2020 2
3 A1 25-03-2020 1
4 A3 01-04-2020 1
5 A3 30-05-2020 2
Upvotes: 1
Reputation: 863166
Use GroupBy.rank
with method='dense'
and ascending=False
:
df['date_create'] = pd.to_datetime(df['date_create'])
df['rank'] = (df.groupby('name')['date_create']
.rank(method='dense', ascending=False)
.astype(int))
print (df)
name date_create rank
0 A1 2020-01-28 2
1 A2 2020-02-15 2
2 A2 2020-03-15 1
3 A1 2020-03-25 1
4 A3 2020-01-04 2
5 A3 2020-05-30 1
Upvotes: 2
Reputation: 14949
IIUC, you can try:
df.date_create = pd.to_datetime(df.date_create)
df['order'] = df.sort_values(['name', 'date_create'], ascending=False).groupby('name')['date_create'].cumcount() + 1
df = df.sort_index()
name date_create order
0 A1 2020-01-28 2
1 A2 2020-02-15 2
2 A2 2020-03-15 1
3 A1 2020-03-25 1
4 A3 2020-01-04 2
5 A3 2020-05-30 1
Upvotes: 1