Edward De Brouwer

Reputation: 57

Uniqueness index by group in pandas

I have a dataframe that looks like this:

A = pd.DataFrame({'Id': [1, 1, 1, 1, 2, 2, 2, 3], 'Time': [0, 0.5, 1.2, 1.2, 0.2, 0.2, 0.5, 1.7]})
idx Id  Time
0   1   0.0
1   1   0.5
2   1   1.2
3   1   1.2
4   2   0.2
5   2   0.2
6   2   0.5
7   3   1.7

I would like to assign a time-ordering index to the distinct values of Time within each Id. This column should give the rank of each Time value among the values for that specific Id, and entries with the same Id and the same Time should get the same ordering index. The result should look like this:

idx Id  Time Time_order
0   1   0.0  0
1   1   0.5  1
2   1   1.2  2
3   1   1.2  2
4   2   0.2  0
5   2   0.2  0
6   2   0.5  1
7   3   1.7  0

How can I do that efficiently, without for-loops? Thanks so much.

Upvotes: 1

Views: 45

Answers (1)

piRSquared

Reputation: 294488

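Use groupby with rank and method='dense'. Dense ranking gives tied values the same rank and leaves no gaps after a tie; subtracting 1 makes the index start at 0.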

A.assign(Time_order=A.groupby('Id').Time.rank(method='dense') - 1)

   idx  Id  Time  Time_order
0    0   1   0.0         0.0
1    1   1   0.5         1.0
2    2   1   1.2         2.0
3    3   1   1.2         2.0
4    4   2   0.2         0.0
5    5   2   0.2         0.0
6    6   2   0.5         1.0
7    7   3   1.7         0.0
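
To see why method='dense' is the right choice here, it may help to compare it with method='min' on a single group. This is only a small sketch: the `times` Series is made up for illustration and is not the question's data.

```python
import pandas as pd

# Hypothetical single-group example, not taken from the question's data.
times = pd.Series([0.0, 0.5, 1.2, 1.2, 1.5])

# 'dense': tied values share a rank and the next distinct value gets rank + 1.
dense = times.rank(method='dense')

# 'min': tied values share a rank, but the next rank skips ahead by the tie size.
minimum = times.rank(method='min')

comparison = pd.DataFrame({'Time': times, 'dense': dense, 'min': minimum})
print(comparison)
```

With 'dense' the value after the tie gets the next consecutive rank, whereas 'min' leaves a gap, which would break the "ordering index" the question asks for.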

rank returns floats. Add astype(int) if you really want integers:

A.assign(Time_order=A.groupby('Id').Time.rank(method='dense').astype(int) - 1)

   idx  Id  Time  Time_order
0    0   1   0.0           0
1    1   1   0.5           1
2    2   1   1.2           2
3    3   1   1.2           2
4    4   2   0.2           0
5    5   2   0.2           0
6    6   2   0.5           1
7    7   3   1.7           0

Alternatively, use pd.factorize with sort=True inside transform:

A.assign(Time_order=A.groupby('Id').Time.transform(lambda x: pd.factorize(x, sort=True)[0]))

   idx  Id  Time  Time_order
0    0   1   0.0         0.0
1    1   1   0.5         1.0
2    2   1   1.2         2.0
3    3   1   1.2         2.0
4    4   2   0.2         0.0
5    5   2   0.2         0.0
6    6   2   0.5         1.0
7    7   3   1.7         0.0
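
For anyone who wants to run both approaches side by side, here is a self-contained sketch. It assumes the frame is rebuilt from the 8-row table printed in the question (without the idx column), and the names `by_rank` and `by_factorize` are only illustrative. The explicit astype(int) on the factorize result is there because the dtype returned by transform may differ between pandas versions.

```python
import pandas as pd

# Frame rebuilt from the table printed in the question (assumption: 8 rows).
A = pd.DataFrame({
    'Id':   [1, 1, 1, 1, 2, 2, 2, 3],
    'Time': [0.0, 0.5, 1.2, 1.2, 0.2, 0.2, 0.5, 1.7],
})

# Approach 1: dense rank within each Id, shifted to start at 0.
by_rank = A.groupby('Id')['Time'].rank(method='dense').astype(int) - 1

# Approach 2: factorize each group's Time values in sorted order.
# factorize returns (codes, uniques); only the codes are needed here.
by_factorize = A.groupby('Id')['Time'].transform(
    lambda x: pd.factorize(x, sort=True)[0]
).astype(int)

result = A.assign(Time_order=by_rank)
print(result)
print((by_rank == by_factorize).all())
```

Both should give the same Time_order on this data. The rank version avoids a Python-level lambda per group, so it is probably the better default on frames with many groups.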

Upvotes: 4
