Reputation: 18123
I do the following cut on a data frame:
df['Age_Groups'] = pd.cut(df.Age, [0, 60, 120, 240, 360, 480, 600, 720, 940],
                          labels=['0-5', '5-10', '11-20', '21-30', '31-40', '41-50', '51-60', '> 60'])
Does this mean that values 0 to 60 are included in '0-5'? Is 60 excluded, or is zero excluded in 0-5, for example?
Upvotes: 0
Views: 3394
Reputation: 18628
Your bins must match your labels:
df['Age_Groups'] = pd.cut(df.Age, [0, 6, 10], labels=['0-5', '6-10'], right=False)
"""
    Age Age_Groups
0     0        0-5
1     1        0-5
2     2        0-5
3     3        0-5
4     4        0-5
5     5        0-5
6     6       6-10
7     7       6-10
8     8       6-10
9     9       6-10
10   10        NaN
"""
According to the docs, by default the left edge of each bin is excluded and the right edge is included:
right : bool, optional Indicates whether the bins include the rightmost edge or not. If right == True (the default), then the bins [1,2,3,4] indicate (1,2], (2,3], (3,4].
Here, with right=False, it is the other way round: the bin 0,6 becomes [0,6), so 0 is included and 6 is excluded.
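To make the edge behaviour concrete for the bins in the question, here is a small sketch comparing the default with right=False. The sample ages are hypothetical values picked to sit exactly on the bin edges, and only the first two bins of the question are used:

```python
import pandas as pd

# Hypothetical ages in months, chosen to land on the bin edges.
ages = pd.Series([0, 1, 60, 61, 120])
bins = [0, 60, 120]
labels = ['0-5', '5-10']

# Default (right=True): the intervals are (0, 60] and (60, 120].
default_cut = pd.cut(ages, bins, labels=labels)

# right=False: the intervals are [0, 60) and [60, 120).
left_closed_cut = pd.cut(ages, bins, labels=labels, right=False)

print(pd.DataFrame({'Age': ages,
                    'right=True': default_cut,
                    'right=False': left_closed_cut}))
```

With the default, 0 falls outside every bin and becomes NaN, while 60 lands in '0-5'. With right=False, 0 lands in '0-5', 60 moves to '5-10', and 120 becomes NaN. If you want to keep the default right-closed bins but still capture 0, passing include_lowest=True to pd.cut should make the first interval include its left edge.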
Upvotes: 1