Bob

Reputation: 879

Convert pandas series into integers

Given a dataframe like this:

'John', 0.25
'Mary', 0.2
'Adam', 0.1
'Andrew', 0.6

I would like to replace every distinct value in a given column (here, the names) with a unique integer. For the data above, the output could look something like this:

0, 0.25
1, 0.2
2, 0.1
3, 0.6

Ideally, the solution would use only pandas or the standard library.

Upvotes: 2

Views: 759

Answers (1)

jezrael

Reputation: 862761

I think you can use factorize, which returns a tuple of integer codes and the unique values they stand for:

print(df)
          a     b
0    'John'  0.25
1    'Mary'  0.20
2    'Mary'  0.20
3    'Adam'  0.10
4    'Adam'  0.10
5    'Adam'  0.10
6  'Andrew'  0.60

print(pd.factorize(df.a))
(array([0, 1, 1, 2, 2, 2, 3]),
 Index([''John'', ''Mary'', ''Adam'', ''Andrew''], dtype='object'))

df['a'] = pd.factorize(df.a)[0]
print(df)

   a     b
0  0  0.25
1  1  0.20
2  1  0.20
3  2  0.10
4  2  0.10
5  2  0.10
6  3  0.60
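
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. The sample frame is rebuilt by hand with unquoted names, and the column name a_code is just an illustrative choice, not something from the question. It also shows a possible alternative using categorical codes, which number the values in sorted order rather than in order of first appearance.

```python
import pandas as pd

# Sample data rebuilt by hand (names stored without the extra quotes)
df = pd.DataFrame({
    "a": ["John", "Mary", "Mary", "Adam", "Adam", "Adam", "Andrew"],
    "b": [0.25, 0.20, 0.20, 0.10, 0.10, 0.10, 0.60],
})

# factorize returns (codes, uniques); codes follow order of first appearance
codes, uniques = pd.factorize(df["a"])
df["a_code"] = codes  # "a_code" is a hypothetical column name

# uniques[i] is the label for code i, so the original names can be recovered
recovered = uniques[codes]

# Alternative: categorical codes, numbered by sorted category order instead
sorted_codes = df["a"].astype("category").cat.codes

print(df)
```

Keeping uniques around is useful if the integers ever need to be mapped back to the original labels.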

Upvotes: 1
