Reputation: 19
id name age year
0 khu 12 2018
1 she 21 2019
2 waqar 22 2015
3 khu 12 2018
4 she 21 2018
5 waqar 22 2015
I want the id column to look like this, so that rows with the same name get the same id:
id name age year
0 khu 12 2018
1 she 21 2019
2 waqar 22 2015
0 khu 12 2018
1 she 21 2018
2 waqar 22 2015
Upvotes: 1
Views: 605
Reputation: 323226
Use factorize. You can also use the category dtype with cat.codes, or sklearn's LabelEncoder.
df['id'] = pd.factorize(df['name'])[0]
df
Out[470]:
id name age year
0 0 khu 12 2018
1 1 she 21 2019
2 2 waqar 22 2015
3 0 khu 12 2018
4 1 she 21 2018
5 2 waqar 22 2015
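The cat.codes alternative mentioned above can be sketched as follows. This is a minimal example that rebuilds the question's frame by hand; the variable names (codes_factorize, codes_category, s) are illustrative only. One thing to watch: factorize numbers values in order of first appearance, while astype('category') orders string categories alphabetically, so the two only agree when the names first appear in sorted order, as they happen to here.

```python
import pandas as pd

# Frame rebuilt from the question's data (without the original id column)
df = pd.DataFrame({
    "name": ["khu", "she", "waqar", "khu", "she", "waqar"],
    "age": [12, 21, 22, 12, 21, 22],
    "year": [2018, 2019, 2015, 2018, 2018, 2015],
})

# factorize: codes follow the order in which each name first appears
codes_factorize = pd.factorize(df["name"])[0]

# cat.codes: codes follow the category order (alphabetical for strings)
codes_category = df["name"].astype("category").cat.codes

df["id"] = codes_category

# The two approaches differ when names do not first appear in sorted order
s = pd.Series(["waqar", "khu", "waqar"])
first_seen = pd.factorize(s)[0]                 # waqar is seen first, so it gets 0
alphabetical = s.astype("category").cat.codes   # khu sorts first, so it gets 0
```

If the ids must follow first-appearance order regardless of spelling, factorize is probably the safer choice.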
Upvotes: 3
Reputation: 862541
Use GroupBy.ngroup:
df['id'] = df.groupby('name', sort=False).ngroup()
# to treat rows as duplicates only when several columns match, group by all of them:
# df['id'] = df.groupby(['name', 'age'], sort=False).ngroup()
print(df)
id name age year
0 0 khu 12 2018
1 1 she 21 2019
2 2 waqar 22 2015
3 0 khu 12 2018
4 1 she 21 2018
5 2 waqar 22 2015
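To illustrate the multi-column variant from the comment above, here is a small sketch that groups on name, age and year together. The frame is rebuilt by hand from the question's data, and including year in the key is an assumption made for illustration, not something the question asks for. Because the two "she" rows differ in year, they end up with different ids.

```python
import pandas as pd

# Frame rebuilt from the question's data (without the original id column)
df = pd.DataFrame({
    "name": ["khu", "she", "waqar", "khu", "she", "waqar"],
    "age": [12, 21, 22, 12, 21, 22],
    "year": [2018, 2019, 2015, 2018, 2018, 2015],
})

# sort=False numbers the groups in order of first appearance;
# a row only shares an id with another row when all three columns match
df["id"] = df.groupby(["name", "age", "year"], sort=False).ngroup()

print(df[["id", "name", "age", "year"]])
```

Here rows 0 and 3 share id 0 and rows 2 and 5 share id 2, while ("she", 21, 2019) gets id 1 and ("she", 21, 2018) gets a new id, 3.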
Upvotes: 5