Reputation: 650
Below is my demo dataframe:
import numpy as np
import pandas as pd
df = pd.DataFrame({"a": np.random.randint(1, high=50, size=50)})
bins = np.arange(0, df['a'].max() + 1, 5).astype('int')  # bin edges in steps of 5
When I run the call below, I get each range and its count, like this:
df.a.value_counts(bins=bins, sort=False)
(-0.001, 5.0] 3
(5.0, 10.0] 2
(10.0, 15.0] 5
(15.0, 20.0] 3
(20.0, 25.0] 5
(25.0, 30.0] 10
(30.0, 35.0] 6
(35.0, 40.0] 6
(40.0, 45.0] 4
What I want: when I give a range, say [20:50], it should return the maximum count among the bins inside that range. Here it is 10.
I also want to know which bin that count belongs to, here [25:30], and, if possible, the actual values that fall in that bin or their mean.
Upvotes: 0
Views: 245
Reputation: 150735
Try the IntervalIndex.overlaps method:
# the counts
counts = df.a.value_counts(bins=bins,sort=False)
# query interval
interval = pd.Interval(20,50)
counts.loc[counts.index.overlaps(interval)].idxmax()
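The line above returns the winning bin. To also get the count itself, plus the actual values in that bin and their mean, something like the following sketch might work. The fixed seed, the variable names (query, in_range, best_bin, values) and the wider bin edges (max + 5 rather than max + 1, so values above the last multiple of 5 are not dropped) are my own assumptions, not part of the question.

```python
import numpy as np
import pandas as pd

# Demo data; the fixed seed is an assumption added for reproducibility.
rng = np.random.default_rng(0)
df = pd.DataFrame({"a": rng.integers(1, 50, size=50)})

# Bin edges in steps of 5, extended so the top edge covers the maximum value.
bins = np.arange(0, df["a"].max() + 5, 5)

# Count per bin, kept in bin order.
counts = df["a"].value_counts(bins=bins, sort=False)

# Keep only the bins that overlap the query range.
query = pd.Interval(20, 50)
in_range = counts[counts.index.overlaps(query)]

best_bin = in_range.idxmax()   # the interval with the highest count
max_count = in_range.max()     # the count itself

# Actual values falling in that bin (left edge open, right edge closed).
mask = (df["a"] > best_bin.left) & (df["a"] <= best_bin.right)
values = df.loc[mask, "a"]

print(best_bin, max_count)
print(values.tolist(), values.mean())
```

If several bins tie for the highest count, idxmax returns the first one in bin order.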
Upvotes: 4