Futurex

Reputation: 23

Python dataframe create index column based on other id column

I have a dataframe like this:

ID                  Price
000afb96ded6677c    1514.5
000afb96ded6677c    13.0
000afb96ded6677c    611.0
000afb96ded6677c    723.0
000afb96ded6677c    2065.0
ffea14e87a4e1269    2286.0
ffea14e87a4e1269    1150.0
ffea14e87a4e1269    80.0
fff455057ad492da    650.0
fff5fc66c1fd66c2    450.0

I need a second ID column that counts up from 1, but it should increase once per distinct ID rather than once per row, as in the output below:

ID                  Price    ID 2
000afb96ded6677c    1514.5   1
000afb96ded6677c    13.0     1
000afb96ded6677c    611.0    1
000afb96ded6677c    723.0    1
000afb96ded6677c    2065.0   1
ffea14e87a4e1269    2286.0   2
ffea14e87a4e1269    1150.0   2
ffea14e87a4e1269    80.0     2
fff455057ad492da    650.0    3
fff5fc66c1fd66c2    450.0    4

Upvotes: 0

Views: 255

Answers (1)

Henry Ecker

Reputation: 35686

Try groupby ngroup + 1:

df['ID_2'] = df.groupby('ID').ngroup() + 1

Or with rank:

df['ID_2'] = df['ID'].rank(method='dense').astype(int)

Or with pd.factorize:

df['ID_2'] = pd.factorize(df['ID'])[0] + 1

df:

                 ID   Price  ID_2
0  000afb96ded6677c  1514.5     1
1  000afb96ded6677c    13.0     1
2  000afb96ded6677c   611.0     1
3  000afb96ded6677c   723.0     1
4  000afb96ded6677c  2065.0     1
5  ffea14e87a4e1269  2286.0     2
6  ffea14e87a4e1269  1150.0     2
7  ffea14e87a4e1269    80.0     2
8  fff455057ad492da   650.0     3
9  fff5fc66c1fd66c2   450.0     4
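
For anyone who wants to run this end to end, here is a minimal self-contained sketch that rebuilds the sample frame from the question and applies all three options. The variable names (by_ngroup, by_rank, by_factorize, shuffled) are just illustrative. It also shows one thing worth knowing: the three only agree here because the IDs are already sorted. As far as I know, groupby sorts the keys by default and rank orders by value, while pd.factorize numbers IDs in order of first appearance.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "ID": ["000afb96ded6677c"] * 5
          + ["ffea14e87a4e1269"] * 3
          + ["fff455057ad492da", "fff5fc66c1fd66c2"],
    "Price": [1514.5, 13.0, 611.0, 723.0, 2065.0,
              2286.0, 1150.0, 80.0, 650.0, 450.0],
})

# The three options from the answer
by_ngroup = df.groupby("ID").ngroup() + 1
by_rank = df["ID"].rank(method="dense").astype(int)
by_factorize = pd.factorize(df["ID"])[0] + 1

df["ID_2"] = by_ngroup
print(df)

# Hypothetical unsorted IDs to show where the methods differ
shuffled = pd.DataFrame({"ID": ["b", "a", "b", "c"]})

# groupby sorts keys by default, so "a" gets 1 even though "b" comes first
sorted_ids = shuffled.groupby("ID").ngroup() + 1

# factorize numbers IDs in order of first appearance
appearance_ids = pd.factorize(shuffled["ID"])[0] + 1

# sort=False makes ngroup follow order of first appearance as well
unsorted_ngroup = shuffled.groupby("ID", sort=False).ngroup() + 1

print(sorted_ids.tolist())       # [2, 1, 2, 3]
print(appearance_ids.tolist())   # [1, 2, 1, 3]
print(unsorted_ngroup.tolist())  # [1, 2, 1, 3]
```

So if the numbering should follow the order in which IDs first show up in the data, pd.factorize or groupby(..., sort=False) is probably the safer pick. If it should follow sorted ID order, the default groupby or rank does that.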

Upvotes: 1
