Reputation: 23
I have a dataframe like this:
ID Price
000afb96ded6677c 1514.5
000afb96ded6677c 13.0
000afb96ded6677c 611.0
000afb96ded6677c 723.0
000afb96ded6677c 2065.0
ffea14e87a4e1269 2286.0
ffea14e87a4e1269 1150.0
ffea14e87a4e1269 80.0
fff455057ad492da 650.0
fff5fc66c1fd66c2 450.0
I need a second ID column that numbers the distinct IDs from 1 up to however many unique IDs there are, so that rows sharing an ID share the same number, as in the output below:
ID Price ID_2
000afb96ded6677c 1514.5 1
000afb96ded6677c 13.0 1
000afb96ded6677c 611.0 1
000afb96ded6677c 723.0 1
000afb96ded6677c 2065.0 1
ffea14e87a4e1269 2286.0 2
ffea14e87a4e1269 1150.0 2
ffea14e87a4e1269 80.0 2
fff455057ad492da 650.0 3
fff5fc66c1fd66c2 450.0 4
Upvotes: 0
Views: 255
Reputation: 35686
Try groupby ngroup + 1:
df['ID_2'] = df.groupby('ID').ngroup() + 1
Or with rank:
df['ID_2'] = df['ID'].rank(method='dense').astype(int)
Or with pd.factorize:
df['ID_2'] = pd.factorize(df['ID'])[0] + 1
df:
ID Price ID_2
0 000afb96ded6677c 1514.5 1
1 000afb96ded6677c 13.0 1
2 000afb96ded6677c 611.0 1
3 000afb96ded6677c 723.0 1
4 000afb96ded6677c 2065.0 1
5 ffea14e87a4e1269 2286.0 2
6 ffea14e87a4e1269 1150.0 2
7 ffea14e87a4e1269 80.0 2
8 fff455057ad492da 650.0 3
9 fff5fc66c1fd66c2 450.0 4
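For anyone who wants to check this locally, here is a minimal self-contained sketch that rebuilds the sample frame from the question and runs all three approaches. The names `by_ngroup`, `by_rank`, `by_factorize` and the small `unsorted` frame are only illustrative and are not from the question. One thing worth knowing: the three agree here because the IDs already appear in sorted order. With unsorted IDs, `ngroup` (which sorts group keys by default) and dense `rank` number by sorted value, while `pd.factorize` numbers by order of first appearance; passing `sort=False` to `groupby` should give first-appearance numbering as well.

```python
import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({
    "ID": ["000afb96ded6677c"] * 5
          + ["ffea14e87a4e1269"] * 3
          + ["fff455057ad492da", "fff5fc66c1fd66c2"],
    "Price": [1514.5, 13.0, 611.0, 723.0, 2065.0,
              2286.0, 1150.0, 80.0, 650.0, 450.0],
})

# The three approaches from the answer, kept as separate results for comparison.
by_ngroup = df.groupby("ID").ngroup() + 1
by_rank = df["ID"].rank(method="dense").astype(int)
by_factorize = pd.factorize(df["ID"])[0] + 1

df["ID_2"] = by_ngroup

# Illustrative frame (an assumption, not from the question) where the IDs
# are NOT in sorted order, to show where the approaches differ.
unsorted = pd.DataFrame({"ID": ["b", "a", "b", "c"]})

sorted_numbering = unsorted.groupby("ID").ngroup() + 1                   # by sorted key
appearance_numbering = pd.factorize(unsorted["ID"])[0] + 1               # by first appearance
ngroup_unsorted = unsorted.groupby("ID", sort=False).ngroup() + 1        # by first appearance

print(df)
print(sorted_numbering.tolist(), list(appearance_numbering), ngroup_unsorted.tolist())
```

If the numbering has to follow the order in which IDs first show up rather than their sorted order, `pd.factorize` or `groupby(..., sort=False)` is probably the safer pick.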
Upvotes: 1