(Python) Grouping intervals in pandas dataframe

Question

My dataframe has many columns, including a column 'sec_time' that holds times in seconds (type = float). I'm trying to bin these values into intervals and count the rows in each one, so I used this code:

data.groupby(pd.cut(user_data['sec_time'],[0,60,120,180,240,300,360,420])).count()

The output looks something like

 (0, 60] 2 
 (60,120] 8
 ...
 (360,420] 13

I am getting the right output, except I don't know how to add a final open-ended interval (420+) so that I don't miss any values. How should I go about this?

Reza · Accepted Answer

You can add inf as the last bin edge, so that the final interval (420, inf] catches every value above 420:

import numpy as np
import pandas as pd
data = pd.DataFrame({'sec_time': np.random.randint(0, 1000, 30)})
data.groupby(pd.cut(data['sec_time'],[0,60,120,180,240,300,360,420, float('inf')])).count()

                sec_time
sec_time                
(0.0, 60.0]            4
(60.0, 120.0]          2
(120.0, 180.0]         0
(180.0, 240.0]         1
(240.0, 300.0]         1
(300.0, 360.0]         0
(360.0, 420.0]         1
(420.0, inf]          21
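
One thing to watch: with the default settings pd.cut builds right-closed intervals and leaves out the lowest edge, so a value of exactly 0 lands in no bin and is silently dropped from the counts. Below is a minimal sketch of one way to handle that. The sample values are made up for illustration, and passing include_lowest=True and observed=False is my own suggestion rather than part of the original code.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, fixed so the counts are reproducible.
data = pd.DataFrame(
    {'sec_time': [0.0, 30.0, 60.0, 90.0, 419.0, 420.0, 421.0, 1500.0]}
)

# Same bin edges as in the question, plus an open-ended last edge.
edges = [0, 60, 120, 180, 240, 300, 360, 420, np.inf]

# include_lowest=True keeps a value of exactly 0 in the first bin;
# with the default it would fall outside (0, 60] and become NaN.
bins = pd.cut(data['sec_time'], edges, include_lowest=True)

# observed=False keeps empty bins in the result with a count of 0.
counts = data.groupby(bins, observed=False)['sec_time'].count()
print(counts)
```

Here 0, 30 and 60 all fall in the first bin, 420 still belongs to the (360, 420] bin because the intervals are closed on the right, and 421 and 1500 end up in the open-ended last bin.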
