Ananda

Reputation: 3272

How to encode a range of values using pandas

I have a pandas DataFrame with a column Age. I want to encode it into categorical values based on specific ranges: for example, ages below 15 should become 0, ages from 15 to 30 should become 1, and so on.

I found the following way to do this (after a lot of confusion about the use of & versus and):

age = X.loc[:, 'Age'].copy()  # work on a copy so X is not modified through a view

age[ age<15 ] = 0
age[ (15<=age) & (age<=30) ] = 1  # <= so that age 15 itself is encoded
age[ (30<age) & (age<=50) ] = 2
age[ (50<age) & (age<=80) ] = 3

Is this the best way to do this? Can I do it with, for example, LabelEncoder?

Upvotes: 4

Views: 5164

Answers (1)

jezrael

Reputation: 862791

You can use pd.cut, which assigns each value to a bin defined by a list of edges:

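import pandas as pd

df = pd.DataFrame({'Age':[0,1,14,15,30,31,50,51,79,80]})

# bin edges give the intervals [0, 14], (14, 30], (30, 50], (50, 80]
bins = [0,14,30,50,80]
labels=[0,1,2,3]
# include_lowest=True closes the first interval on the left so that age 0 is included
df['bins'] = pd.cut(df['Age'], bins=bins, labels=labels, include_lowest=True)
print(df)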
   Age bins
0    0    0
1    1    0
2   14    0
3   15    1
4   30    1
5   31    2
6   50    2
7   51    3
8   79    3
9   80    3
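
If plain integer codes are enough, a minimal sketch (assuming the same bin edges as above; the `ages` series and the out-of-range value 95 are made up for illustration) is to pass `labels=False`, which makes `pd.cut` return the bin position instead of a categorical label. It also shows what happens to values that fall outside every bin:

```python
import pandas as pd

# Hypothetical sample data; 95 lies outside the last bin edge (80).
ages = pd.Series([0, 14, 15, 30, 31, 50, 80, 95], name="Age")

# Same edges as above: [0, 14], (14, 30], (30, 50], (50, 80]
bins = [0, 14, 30, 50, 80]

# labels=False returns the integer position of the bin for each value.
# Values outside all bins become NaN rather than raising an error.
codes = pd.cut(ages, bins=bins, labels=False, include_lowest=True)

print(codes.tolist())
```

Because out-of-range ages come back as NaN, it may be worth checking `codes.isna()` before using the result, or widening the last edge if older ages are expected.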

Upvotes: 7
