Oleg Tarasenko

Reputation: 9610

python pandas: Grouping dataframe by ranges

I have a DataFrame with date and calltime columns.

I was trying to build a histogram based on the second column, e.g.:

df.groupby('calltime').head(10).plot(kind='hist', y='calltime')

In the resulting plot almost everything falls into the first bar. I want to see more detail there: the range of that bar, 0-2500, is huge, and all the data is hidden inside it.

Is it possible to group by a smaller range, e.g. by 50 or something like that?

UPD

                  date  calltime
0     1491928756414930      4643
1     1491928756419607       166
2     1491928756419790       120
3     1491928756419927       142
4     1491928756420083       121
5     1491928756420217       109
6     1491928756420409        52
7     1491928756420476       105
8     1491928756420605        35
9     1491928756420654       120
10    1491928756420787       105
11    1491928756420907        93
12    1491928756421013        37
13    1491928756421062       112
14    1491928756421187        41
15    1491928756421240       122
16    1491928756421375        28
17    1491928756421416       158
18    1491928756421587        65
19    1491928756421667       108
20    1491928756421790        55
21    1491928756421858       145
22    1491928756422018        37
23    1491928756422068        63
24    1491928756422145        57
25    1491928756422214        43
26    1491928756422270        73
27    1491928756422357        90
28    1491928756422460        72
29    1491928756422546        77
...                ...       ...
9845  1491928759997328       670
9846  1491928759998255       372
9848  1491928759999116       659
9849  1491928759999897       369
9850  1491928760000380       746
9851  1491928760001245       823
9852  1491928760002189       634
9853  1491928760002869       335
9856  1491928760003929      4162
9865  1491928760009368       531

Upvotes: 0

Views: 57

Answers (1)

piRSquared

Reputation: 294218

Use the bins argument of hist to get narrower bars:

import numpy as np
import pandas as pd
s = pd.Series(np.abs(np.random.randn(100)) ** 3 * 2000)
s.hist(bins=20)
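
bins also accepts an explicit sequence of edges, which is one way to get the width of 50 asked about in the question. The following is a minimal sketch, not a definitive recipe: the ten values are copied from the question's sample as stand-in data, and the 0-2500 range and the name `edges` are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

# First ten calltime values from the question's sample, used as stand-in data.
df = pd.DataFrame({"calltime": [4643, 166, 120, 142, 121, 109, 52, 105, 35, 120]})

# Bin edges every 50 units from 0 to 2500 (the range of the crowded first bar).
edges = np.arange(0, 2500 + 50, 50)

# np.histogram counts the values in each bin; values outside the edges
# (here 4643) are simply not counted.
counts, _ = np.histogram(df["calltime"], bins=edges)
print(counts[:4])

# To draw it, the same edges can be passed to the plotting call
# (requires matplotlib):
# df["calltime"].plot(kind="hist", bins=edges)
```

Passing the edges explicitly keeps the bar width fixed at 50 regardless of outliers, whereas an integer like bins=20 divides the full data range, so a few large values still stretch every bar.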


Or you can use pd.cut to produce your own custom bins.

pd.cut(
    s, [-np.inf] + [100 * i for i in range(10)] + [np.inf]
).value_counts(sort=False).plot.bar()
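
Applied to a calltime column, the same pd.cut idea might look like the sketch below. The sample values come from the question; the cut-off at 500, the overflow bin, and the names `calltime`, `edges`, `binned` and `per_bin` are illustrative assumptions, not part of the original answer.

```python
import numpy as np
import pandas as pd

# First ten calltime values from the question's sample, used as stand-in data.
calltime = pd.Series([4643, 166, 120, 142, 121, 109, 52, 105, 35, 120],
                     name="calltime")

# Edges every 50 from 0 to 500, plus one open-ended bin that collects outliers.
edges = list(range(0, 501, 50)) + [np.inf]

# right=False makes each bin closed on the left: [0, 50), [50, 100), ...
binned = pd.cut(calltime, bins=edges, right=False)

# Count rows per bin; sort=False avoids reordering the bins by frequency.
per_bin = binned.value_counts(sort=False)
print(per_bin)

# A bar chart of the counts (requires matplotlib):
# per_bin.plot.bar()
```

The open-ended last bin keeps large outliers such as 4643 visible as a single bar instead of stretching the axis.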


Upvotes: 1
