Reputation: 3272
I have a pandas DataFrame with a column Age. I want to encode it into categorical values by range: for example, ages below 15 should become 0, ages between 15 and 30 should become 1, and so on.
I found this way to do it (after a lot of confusion about when to use & and when to use and):
age = X.loc[:, 'Age']
age[ age<15 ] = 0
age[ (15<=age) & (age<=30) ] = 1
age[ (30<age) & (age<=50) ] = 2
age[ (50<age) & (age<=80) ] = 3
Is this the best way to do this? Could I do it with LabelEncoder, for example?
Upvotes: 4
Views: 5164
Reputation: 862791
You can use cut:
import pandas as pd

df = pd.DataFrame({'Age':[0,1,14,15,30,31,50,51,79,80]})
# bin edges: [0, 14], (14, 30], (30, 50], (50, 80]
bins = [0,14,30,50,80]
labels=[0,1,2,3]
df['bins'] = pd.cut(df['Age'], bins=bins, labels=labels, include_lowest=True)
print(df)
   Age bins
0    0    0
1    1    0
2   14    0
3   15    1
4   30    1
5   31    2
6   50    2
7   51    3
8   79    3
9   80    3
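If you want plain integer codes rather than a categorical column, one option is to pass labels=False to cut, which returns the index of the bin each value falls into. This is a minimal sketch that reuses the same sample data and bin edges as above; the column name code is only an illustrative choice.

```python
import pandas as pd

df = pd.DataFrame({'Age': [0, 1, 14, 15, 30, 31, 50, 51, 79, 80]})
bins = [0, 14, 30, 50, 80]

# labels=False makes cut return the integer position of each bin
# instead of a categorical label
df['code'] = pd.cut(df['Age'], bins=bins, labels=False, include_lowest=True)

# 'code' is a hypothetical column name used for illustration
codes = df['code'].tolist()
print(codes)
```

Note that ages outside the outermost bin edges (for example 81 here) come back as NaN, so widen the last edge if your data can exceed it.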
Upvotes: 7