Reputation: 5
I have this code to convert ages from numeric values to categories. I'm trying to do it this way, but it's not working. Can anybody help me?
for df in treino_teste:
    df.loc[df['Age'] <= 13, 'Age'] = 0,
    df.loc[(df['Age'] > 13) & (df['Age'] <= 18), 'Age'] = 1,
    df.loc[(df['Age'] > 18) & (df['Age'] <= 25), 'Age'] = 2,
    df.loc[(df['Age'] > 25) & (df['Age'] <= 35), 'Age'] = 3,
    df.loc[(df['Age'] > 35) & (df['Age'] <= 60), 'Age'] = 4,
    df.loc[df['Age'] > 60, 'Age'] = 5
Error:
Upvotes: 0
Views: 351
Reputation: 953
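You can use numpy.digitize(). Pass only the interior edges and set right=True so that each bin includes its upper edge, which matches the <= comparisons in the question and gives the categories 0 to 5:
import numpy
bins = [13, 18, 25, 35, 60]
df['AgeC'] = numpy.digitize(df['Age'], bins, right=True)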
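To see how numpy.digitize() treats values that sit exactly on a bin edge, here is a small sketch. The sample ages are made up for illustration, and the edges are the upper bounds taken from the conditions in the question.

```python
import numpy as np

ages = np.array([5, 13, 14, 18, 25, 35, 60, 61])  # made-up sample ages
edges = [13, 18, 25, 35, 60]  # upper bounds from the question's conditions

# right=True: bins are (lower, upper], the same as the <= comparisons
codes_closed_right = np.digitize(ages, edges, right=True)

# default right=False: bins are [lower, upper), so an age of 13 lands in category 1
codes_default = np.digitize(ages, edges)

print(codes_closed_right)  # [0 0 1 1 2 3 4 5]
print(codes_default)       # [0 1 1 2 3 4 5 5]
```

The two results differ only for ages that equal an edge (13, 18, 25, 35, 60), so the choice of right matters if the boundaries have to match the original comparisons.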
Upvotes: 1
Reputation: 31166
import numpy as np
import pandas as pd
df = pd.DataFrame({"Age": np.random.randint(1, 65, 10)}).sort_values(["Age"])
bins = [0, 13, 18, 25, 35, 60, 100]
df.assign(AgeB=pd.cut(df.Age, bins=bins, labels=list(range(len(bins) - 1))))
| | Age | AgeB |
|---|---|---|
| 5 | 12 | 0 |
| 3 | 13 | 0 |
| 8 | 18 | 1 |
| 7 | 25 | 2 |
| 9 | 25 | 2 |
| 1 | 27 | 3 |
| 2 | 30 | 3 |
| 4 | 57 | 4 |
| 0 | 59 | 4 |
| 6 | 64 | 5 |
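The same pd.cut call can be applied inside the loop from the question. This is only a sketch: the two small DataFrames below are hypothetical stand-ins for the contents of treino_teste, and it assumes every age lies in (0, 100] so that no value falls outside the bins.

```python
import pandas as pd

# Hypothetical stand-ins for the DataFrames in the question's treino_teste list
treino = pd.DataFrame({"Age": [5, 13, 18, 27, 64]})
teste = pd.DataFrame({"Age": [12, 25, 36, 60, 61]})
treino_teste = [treino, teste]

bins = [0, 13, 18, 25, 35, 60, 100]
labels = list(range(len(bins) - 1))  # categories 0..5

for df in treino_teste:
    # pd.cut uses right-closed intervals by default: (0, 13], (13, 18], ...
    # astype(int) assumes no age fell outside the bins (no NaN in the result)
    df["AgeB"] = pd.cut(df["Age"], bins=bins, labels=labels).astype(int)

print(treino)
print(teste)
```

An age of exactly 0, or anything above 100, would fall outside these bins and become NaN, which would make astype(int) fail. Passing include_lowest=True to pd.cut covers the 0 case; widening the last edge covers the other.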
Upvotes: 1