Reputation: 57
I have a dataframe that basically looks like this:
import pandas as pd
A = pd.DataFrame({'Id': [1, 1, 1, 1, 2, 2, 2, 3], 'Time': [0, 0.5, 1.2, 1.2, 0.2, 0.2, 0.5, 1.7]})
idx Id Time
0 1 0.0
1 1 0.5
2 1 1.2
3 1 1.2
4 2 0.2
5 2 0.2
6 2 0.5
7 3 1.7
and I would like to assign a time-ordering index to the distinct values of Time within each Id. In other words, this column should give the position of each Time value in the sorted order for its Id. Furthermore, entries with the same Id and the same Time should share the same ordering index. The result should then be:
idx Id Time Time_order
0 1 0.0 0
1 1 0.5 1
2 1 1.2 2
3 1 1.2 2
4 2 0.2 0
5 2 0.2 0
6 2 0.5 1
7 3 1.7 0
How can I do that efficiently, without for-loops? Thanks so much.
Upvotes: 1
Views: 45
Reputation: 294488
groupby with rank using method='dense'
A.assign(Time_order=A.groupby('Id').Time.rank(method='dense') - 1)
idx Id Time Time_order
0 0 1 0.0 0.0
1 1 1 0.5 1.0
2 2 1 1.2 2.0
3 3 1 1.2 2.0
4 4 2 0.2 0.0
5 5 2 0.2 0.0
6 6 2 0.5 1.0
7 7 3 1.7 0.0
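To see why method='dense' matters here, a minimal sketch comparing it with method='min' on the question's data may help. The names `comparison`, `dense`, and `min_rank` are only illustrative. With 'min', tied values leave a gap in the numbering (see Id 2), whereas 'dense' keeps the order indices consecutive.

```python
import pandas as pd

# Data from the question, including the Id 3 row shown in its table
A = pd.DataFrame({
    'Id':   [1, 1, 1, 1, 2, 2, 2, 3],
    'Time': [0, 0.5, 1.2, 1.2, 0.2, 0.2, 0.5, 1.7],
})

grouped = A.groupby('Id')['Time']

# rank() starts at 1 and returns floats, so cast and shift to start at 0
comparison = pd.DataFrame({
    'Id': A['Id'],
    'Time': A['Time'],
    'dense': grouped.rank(method='dense').astype(int) - 1,
    'min_rank': grouped.rank(method='min').astype(int) - 1,
})
print(comparison)
```

For Id 2, the Time value 0.5 gets 1 under 'dense' but 2 under 'min', because 'min' counts both tied 0.2 rows before it.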
astype(int) if you really want to
A.assign(Time_order=A.groupby('Id').Time.rank(method='dense').astype(int) - 1)
idx Id Time Time_order
0 0 1 0.0 0
1 1 1 0.5 1
2 2 1 1.2 2
3 3 1 1.2 2
4 4 2 0.2 0
5 5 2 0.2 0
6 6 2 0.5 1
7 7 3 1.7 0
pd.factorize
A.assign(Time_order=A.groupby('Id').Time.transform(lambda x: pd.factorize(x, sort=True)[0]))
idx Id Time Time_order
0 0 1 0.0 0.0
1 1 1 0.5 1.0
2 2 1 1.2 2.0
3 3 1 1.2 2.0
4 4 2 0.2 0.0
5 5 2 0.2 0.0
6 6 2 0.5 1.0
7 7 3 1.7 0.0
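As a quick sanity check, the sketch below runs both approaches on the question's data and confirms they give the same Time_order. The variable names `by_rank` and `by_factorize` are illustrative, and the final astype(int) on the factorize result is only there so the comparison does not depend on which dtype transform returns in a given pandas version.

```python
import pandas as pd

# Data from the question, including the Id 3 row shown in its table
A = pd.DataFrame({
    'Id':   [1, 1, 1, 1, 2, 2, 2, 3],
    'Time': [0, 0.5, 1.2, 1.2, 0.2, 0.2, 0.5, 1.7],
})

# Approach 1: dense rank within each Id, shifted to start at 0
by_rank = A.groupby('Id')['Time'].rank(method='dense').astype(int) - 1

# Approach 2: factorize each group's Time values in sorted order
by_factorize = A.groupby('Id')['Time'].transform(
    lambda x: pd.factorize(x, sort=True)[0]
).astype(int)

result = A.assign(Time_order=by_rank)
print(result)
print((by_rank == by_factorize).all())
```

One difference worth keeping in mind: with default settings, rank leaves missing Time values as NaN (so the astype(int) cast would fail), while pd.factorize codes them as -1.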
Upvotes: 4