Reputation: 6290
I would like to bin values into equally sized bins. Let's assume that we have the following Pandas Series:
ex = pd.Series([1,2,3,4,5,6,7,888,999])
Now, I would like to create three bins:
pd.cut(ex, 3, labels=False)
This results in three bins and the following bin number assigned to each element of the series:
[0,0,0,0,0,0,0,2,2]
Instead, I would like the bin borders chosen so that each bin holds an equal number of elements (i.e. 3), so the assignment of data points to bins looks like:
[0,0,0,1,1,1,2,2,2]
How can I achieve this? And what should be done for tie breaking (i.e. when the number of data points is not divisible by the number of bins)?
Upvotes: 3
Views: 8901
Reputation: 11
Use the pandas qcut function instead, which bins by quantiles rather than by equal-width intervals. Try this: pd.qcut(ex, q=3, labels=False)
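To make the difference concrete, here is a minimal sketch that runs both functions on the series from the question. The variable names equal_width and equal_count are only illustrative.

```python
import pandas as pd

ex = pd.Series([1, 2, 3, 4, 5, 6, 7, 888, 999])

# pd.cut splits the value range [1, 999] into 3 intervals of equal width,
# so almost everything lands in the first interval.
equal_width = pd.cut(ex, 3, labels=False)

# pd.qcut places the edges at quantiles, so each bin gets
# roughly the same number of points.
equal_count = pd.qcut(ex, q=3, labels=False)

print(equal_width.tolist())  # [0, 0, 0, 0, 0, 0, 0, 2, 2]
print(equal_count.tolist())  # [0, 0, 0, 1, 1, 1, 2, 2, 2]
```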
Upvotes: 1
Reputation: 9081
Use pd.qcut:
pd.qcut(ex, 3, labels=False)
Output
0 0
1 0
2 0
3 1
4 1
5 1
6 2
7 2
8 2
dtype: int64
Use retbins=True to also get the bin edges.
pd.qcut(ex, 3, labels=False, retbins=True)
Output
(0 0
1 0
2 0
3 1
4 1
5 1
6 2
7 2
8 2
dtype: int64,
array([ 1. , 3.66666667, 6.33333333, 999. ]))
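On tie breaking: when the data contains repeated values, the quantile edges can coincide and pd.qcut raises a ValueError. One possible workaround, sketched here with made-up data, is to bin the ranks instead of the values; rank(method="first") breaks ties by order of appearance, so the ranks are always unique. Note that this can put equal values into different bins, which may or may not be what you want.

```python
import pandas as pd

# Made-up example: four equal values make two quantile edges coincide.
s = pd.Series([1, 1, 1, 1, 2, 3, 4, 5, 6])

try:
    pd.qcut(s, 3, labels=False)
    qcut_failed = False
except ValueError:
    # pandas reports that the bin edges must be unique
    qcut_failed = True

# Ranks are 1..9 with ties broken by position, so the edges are distinct.
balanced = pd.qcut(s.rank(method="first"), 3, labels=False)

print(qcut_failed)        # True
print(balanced.tolist())  # [0, 0, 0, 1, 1, 1, 2, 2, 2]
```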
Upvotes: 7
Reputation: 323226
If the series is already sorted and has the default integer index, you can bin by position:
bins = ex.index // 3  # equivalent: np.arange(len(ex)) // 3
bins
Out[98]: Int64Index([0, 0, 0, 1, 1, 1, 2, 2, 2], dtype='int64')
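The positional idea can be extended to unsorted data and to lengths that are not divisible by the number of bins. The following is only a sketch: equal_count_bins is a hypothetical helper, not a pandas function, and the input data is made up. It uses integer arithmetic on the sorted positions, so bin sizes differ by at most one and the earlier bins receive the extra elements.

```python
import pandas as pd

def equal_count_bins(s, n_bins):
    """Hypothetical helper: assign bins 0..n_bins-1 with near-equal counts."""
    # Position of each value in sorted order (0-based); ties are broken
    # by order of appearance, so every position is unique.
    pos = s.rank(method="first").astype(int) - 1
    # Integer arithmetic spreads the positions over the bins.
    return pos * n_bins // len(s)

# 10 unsorted points into 3 bins
s = pd.Series([999, 3, 1, 888, 2, 7, 5, 4, 6, 1000])
bins = equal_count_bins(s, 3)

print(bins.tolist())                            # [2, 0, 0, 2, 0, 1, 1, 0, 1, 2]
print(bins.value_counts().sort_index().tolist())  # [4, 3, 3]
```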
Upvotes: 0