Aggregate and count number of continuous time intervals in pandas

Question

I want to aggregate the continuous time intervals and count them for each person in pandas.

As an example I have dataframe that looks like this:

     Name      Time
0     Bob  10:59:00
1     Bob  11:00:00
2     Bob  11:01:00
3     Bob  11:14:00
4     Bob  11:15:00
5     Bob  11:17:00
6     Bob  11:18:00
7     Bob  11:19:00
8     Bob  13:10:00
9     Bob  13:10:00
10    Bob  13:15:00
11    Bob  13:16:00
12    Bob  13:17:00
13    Bob  13:18:00
14    Bob  13:19:00
15  Alice  13:17:00
16  Alice  13:18:00
17  Alice  13:19:00

With the following output:

Note: I only need the count not the time intervals themselves.

Bob = 5
1. 10:59 - 11:01
2. 11:14 - 11:15
3. 11:17 - 11:19
4. 13:10 - 13:10
5. 13:15 - 13:19

Alice = 1
1. 13:17 - 13:19

Valdi_Bo · Accepted Answer

I assume that your Time column is of string type.

You can use the following code:

result = df.groupby('Name', sort=False).apply(
    lambda grp: grp.groupby((pd.to_datetime(grp.Time).diff().dt
    .total_seconds() / 60).fillna(2).ge(2).cumsum())
    .Time.apply(lambda tt: f'{tt.iloc[0]} - {tt.iloc[-1]}'))\
    .rename_axis(['Name', 'Period'])

The result is:

Name   Period
Bob    1         10:59:00 - 11:01:00
       2         11:14:00 - 11:15:00
       3         11:17:00 - 11:19:00
       4         13:10:00 - 13:10:00
       5         13:15:00 - 13:19:00
Alice  1         13:17:00 - 13:19:00
Name: Time, dtype: object

The above code contains 2-level grouping:

First level - by Name,
Second level - by the number of a "period" - a sequence of consecutive time values.

From each 2nd level group there is taken the first and last time (separated with a minus).

The last part is to rename columns in the index, to better reflect the meaning of the data displayed.

If you want a printout looking almost exactly as in our post, run:

for key, grp in result.groupby(level=0, sort=False):
    print(f'{key} = {grp.size}')
    print(f'{grp.reset_index(level=0, drop=True).rename_axis(None).to_string()}
')

(try yourself to see the result).

This time there are no dots after the period number, but in my opinion it is not important.

Aggregate and count number of continuous time intervals in pandas

Answers (1)

Related Questions