Reputation: 31
I have this dataset
age
24
32
29
23
23
31
25
26
34
I want to categorize the ages using Python and save the result to a new column, "agegroup": ages 23 to 26 should return 1, ages 27 to 30 should return 2, and ages 31 to 34 should return 3.
Upvotes: 1
Views: 2208
Reputation: 1
You can also do this with a dictionary of key-value pairs. Each key is an age range, and its value is the count of ages that fall in that range.
# ages from the question
ages = [24, 32, 29, 23, 23, 31, 25, 26, 34]
groupDict = {'23-26': 0, '27-30': 0, '31-34': 0}
for i in ages:
    if 23 <= i <= 26:
        groupDict['23-26'] += 1
    elif 27 <= i <= 30:
        groupDict['27-30'] += 1
    elif 31 <= i <= 34:
        # count under the '31-34' key, not '27-30'
        groupDict['31-34'] += 1
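The loop above counts how many ages fall in each range, while the question asks for a group label per row. A minimal plain-Python sketch of that per-row labelling could look like the following; the helper name age_group and the choice to return None for ages outside 23-34 are assumptions made for illustration, not part of the original answer.

```python
def age_group(age):
    """Return 1, 2, or 3 for ages 23-26, 27-30, 31-34; None otherwise."""
    if 23 <= age <= 26:
        return 1
    if 27 <= age <= 30:
        return 2
    if 31 <= age <= 34:
        return 3
    return None  # assumption: ages outside the stated ranges get no group

# ages from the question
ages = [24, 32, 29, 23, 23, 31, 25, 26, 34]

# one label per age, in the same order as the input
agegroups = [age_group(a) for a in ages]
print(agegroups)  # [1, 3, 2, 1, 1, 3, 1, 1, 3]
```

The resulting list has one entry per age, so it can be stored alongside the original values as the "agegroup" column.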
Upvotes: 0
Reputation: 78690
You can use pandas.cut.
Given:
>>> df
   age
0   24
1   32
2   29
3   23
4   23
5   31
6   25
7   26
8   34
Solution:
>>> df.assign(agegroup=pd.cut(df['age'], bins=[23, 27, 31, 35], right=False, labels=[1, 2, 3]))
   age agegroup
0   24        1
1   32        3
2   29        2
3   23        1
4   23        1
5   31        3
6   25        1
7   26        1
8   34        3
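For anyone who wants to run this end to end, here is a self-contained sketch of the same pandas.cut call. It rebuilds the DataFrame from the ages in the question and stores the result on df itself, since df.assign returns a new DataFrame rather than modifying df. The final conversion to plain integers is an optional extra and assumes every age falls inside one of the bins.

```python
import pandas as pd

# rebuild the DataFrame from the ages in the question
df = pd.DataFrame({"age": [24, 32, 29, 23, 23, 31, 25, 26, 34]})

# right=False makes each bin closed on the left and open on the right:
# [23, 27) -> 1, [27, 31) -> 2, [31, 35) -> 3
df["agegroup"] = pd.cut(
    df["age"], bins=[23, 27, 31, 35], right=False, labels=[1, 2, 3]
)

# pd.cut returns a categorical column; convert to plain integers if preferred
# (assumes no age fell outside the bins, which would produce NaN)
df["agegroup"] = df["agegroup"].astype(int)

print(df)
```

With right=False the upper edge of each bin is excluded, which is why the edges are 27, 31, and 35 rather than 26, 30, and 34.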
Upvotes: 3