immaprogrammingnoob

Reputation: 167

Grouping values into custom bins

I have a data frame with an 'education' attribute. Values are discrete, 1-16. For purposes of cross-tabulation, I want to bin this 'education' variable but with custom bins (1:8, 9:11, 12, 13:15, 16).

I've been experimenting with pd.cut(), but I get an invalid syntax error:

adult_df_educrace['education_bins'] = pd.cut(x=adult_df_educrace['education'], bins=[1:8, 9, 10:11, 12, 13:15, 16], labels = ['Middle School or less', 'Some High School', 'High School Grad', 'Some College', 'College Grad'])

Upvotes: 0

Views: 232

Answers (1)

Quang Hoang

Reputation: 150745

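The syntax error comes from writing ranges like 1:8 inside a list; slice notation is not valid there. pd.cut expects a list of bin edges, so try placing the edges halfway between the thresholds: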

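import numpy as np
import pandas as pd

# Edges sit halfway between the integer codes, so each code falls inside exactly one bin
bins = [0.5, 8.5, 11.5, 12.5, 15.5, 16.5]
labels = ['Middle School or less', 'Some High School',
          'High School Grad', 'Some College', 'College Grad']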

adult_df_educrace['education_bins'] = pd.cut(x=adult_df_educrace['education'],
                                             bins=bins,
                                             labels=labels)

Test data (define this before running the code above):

adult_df_educrace = pd.DataFrame({'education':np.arange(1,17)})

Output:

    education         education_bins
0           1  Middle School or less
1           2  Middle School or less
2           3  Middle School or less
3           4  Middle School or less
4           5  Middle School or less
5           6  Middle School or less
6           7  Middle School or less
7           8  Middle School or less
8           9       Some High School
9          10       Some High School
10         11       Some High School
11         12       High School Grad
12         13           Some College
13         14           Some College
14         15           Some College
15         16           College Grad
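
As a possible alternative, integer edges should give the same grouping, because pd.cut closes each bin on the right by default (right=True). The sketch below is only an illustration under that assumption: `df` is a hypothetical stand-in for adult_df_educrace, and `int_bins` and `half_bins` are names made up here.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for adult_df_educrace: one row per education code 1..16
df = pd.DataFrame({'education': np.arange(1, 17)})

labels = ['Middle School or less', 'Some High School',
          'High School Grad', 'Some College', 'College Grad']

# With right=True (the default) each bin is (left, right], so these edges
# define (0, 8], (8, 11], (11, 12], (12, 15], (15, 16]
int_bins = [0, 8, 11, 12, 15, 16]
df['education_bins'] = pd.cut(df['education'], bins=int_bins, labels=labels)

# Compare against the half-step edges used in the answer above
half_bins = [0.5, 8.5, 11.5, 12.5, 15.5, 16.5]
half_step = pd.cut(df['education'], bins=half_bins, labels=labels)
same = df['education_bins'].astype(str).tolist() == half_step.astype(str).tolist()

# Number of rows that landed in each bin
counts = df['education_bins'].value_counts()
```

Either set of edges works for integer codes; the half-step version just makes it more obvious that no value sits exactly on an edge.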

Upvotes: 1
