Reputation: 43209
I am trying to bin a dataframe column that contains ages in the range 0 to 100. When I set the first bin edge to 0 so that the zero ages are included, it does not work.
Here is a demo using a series that covers the range of my data:
pd.cut(pd.Series(range(101)), [0, 24, 49, 74, 100])
The zero value in the range returns NaN from the cut.
Any way around this?
Upvotes: 6
Views: 13099
Reputation: 31672
IIUC, you need to set the include_lowest argument to True. From the docs:
include_lowest : bool
Whether the first interval should be left-inclusive or not.
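The zero is dropped because pd.cut builds right-closed intervals by default, so the first bin is (0, 24] and does not contain 0. Here is a minimal sketch of the difference, using a small made-up series (the sample values are only for illustration and are not from the question's data):

```python
import pandas as pd

# Illustrative sample values, chosen to sit on or near the bin edges.
ages = pd.Series([0, 1, 24, 25, 100])
bins = [0, 24, 49, 74, 100]

# Default: intervals are open on the left, so 0 falls outside (0, 24].
default_cut = pd.cut(ages, bins)

# include_lowest=True makes the first interval include its left edge.
inclusive_cut = pd.cut(ages, bins, include_lowest=True)

print(default_cut.isna().tolist())        # [True, False, False, False, False]
print(inclusive_cut.isna().tolist())      # [False, False, False, False, False]
print(inclusive_cut.cat.codes.tolist())   # [0, 0, 0, 1, 3]
```

Only the first interval is affected; the other bins stay open on the left, so 24 still lands in the first bin and 25 in the second.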
For your case:
pd.cut(pd.Series(range(101)), [0,24,49,74,100], include_lowest=True)
In [148]: pd.cut(pd.Series(range(101)), [0,24,49,74,100], include_lowest=True).head(10)
Out[148]:
0    [0, 24]
1    [0, 24]
2    [0, 24]
3    [0, 24]
4    [0, 24]
5    [0, 24]
6    [0, 24]
7    [0, 24]
8    [0, 24]
9    [0, 24]
dtype: category
Categories (4, object): [[0, 24] < (24, 49] < (49, 74] < (74, 100]]
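As an alternative that is not part of the answer above, the same grouping can probably be obtained with left-closed bins by passing right=False and shifting the edges up by one. This is only a sketch, and it assumes the ages are whole numbers; the shifted edges [0, 25, 50, 75, 101] are my own choice, not something from the question:

```python
import pandas as pd

ages = pd.Series(range(101))

# Left-closed intervals: [0, 25), [25, 50), [50, 75), [75, 101).
# For integer ages these cover the same groups as 0-24, 25-49, 50-74, 75-100.
left_closed = pd.cut(ages, [0, 25, 50, 75, 101], right=False)

# The include_lowest version from the answer, for comparison.
inclusive = pd.cut(ages, [0, 24, 49, 74, 100], include_lowest=True)

print(left_closed.isna().sum())  # 0
print((left_closed.cat.codes == inclusive.cat.codes).all())  # True
```

Note that newer pandas versions may display the first include_lowest interval as something like (-0.001, 24.0] rather than [0, 24]; the bin membership is the same.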
Upvotes: 17