MI MA

Reputation: 181

Pandas DataFrame: Changing column values

I have a large dataframe with the columns DOB and ID:

import pandas as pd 
df = pd.read_csv('data.csv')

df.head() 

ID      DOB
223725  1975.0
223725  1975.0
223725  1975.0
223725  1975.0
223725  1975.0

There are 63 different years in DOB. I want to replace each year in this column with its rank among the distinct years: the lowest year, 1911, becomes 1, the 2nd lowest becomes 2, the 3rd lowest becomes 3, and so on.

How do I make this change fast?

Upvotes: 0

Views: 33

Answers (1)

jezrael

Reputation: 862481

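You can use Series.rank with method='dense', which gives equal values the same rank and numbers the distinct values consecutively from 1: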

df['DOB1'] = df['DOB'].rank(method='dense')
print(df)
       ID     DOB  DOB1
0  223725  1911.0   1.0
1  223725  2000.0   3.0
2  223725  2006.0   4.0
3  223725  1985.0   2.0
4  223725  1911.0   1.0
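
Note that rank returns floats (1.0, 2.0, ...). If plain integers are wanted, a minimal sketch along the following lines may help. It rebuilds the five sample rows shown above rather than reading data.csv, and it assumes DOB has no missing values, since astype(int) raises on NaN. The DOB2 column is only an illustrative alternative using pd.factorize with sort=True, which should give the same numbering for this data.

```python
import pandas as pd

# Sample rows copied from the output above (stand-in for data.csv)
df = pd.DataFrame({
    "ID": [223725] * 5,
    "DOB": [1911.0, 2000.0, 2006.0, 1985.0, 1911.0],
})

# Dense rank, then cast to int; assumes DOB contains no NaN
df["DOB1"] = df["DOB"].rank(method="dense").astype(int)

# Alternative: factorize returns 0-based codes in sorted order, so add 1
codes, uniques = pd.factorize(df["DOB"], sort=True)
df["DOB2"] = codes + 1

print(df)
```

If DOB can contain NaN, the two approaches differ: rank leaves those rows as NaN, while factorize marks them with code -1 (so DOB2 would be 0), and the integer cast would need handling separately.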

Upvotes: 2
