Grouping multiple datetime dataframe rows into a single one

Question

I have a dataframe of datetime intervals, and I am trying to squash consecutive intervals into single events. Each row has the start time, the end time, and the duration of the event.

What I have

           start_time            end_time  duration  id
0 2020-01-01 00:00:00 2020-01-01 00:30:00        30   A
1 2020-01-01 00:30:00 2020-01-01 01:00:00        30   B
2 2020-01-01 01:00:00 2020-01-01 01:30:00        30   C
3 2020-01-01 01:30:00 2020-01-01 02:00:00        30   D
4 2020-01-04 05:00:00 2020-01-04 05:30:00        30   E
5 2020-01-04 05:30:00 2020-01-04 06:00:00        30   F
6 2020-01-04 06:00:00 2020-01-04 06:30:00        30   G
7 2020-01-04 06:30:00 2020-01-04 07:00:00        30   H
8 2020-01-04 20:30:00 2020-01-04 21:00:00        30   I

What I'm trying to squash it into

           start_time            end_time  duration  id
0 2020-01-01 00:00:00 2020-01-01 02:00:00       120   A
4 2020-01-04 05:00:00 2020-01-04 07:00:00       120   E
8 2020-01-04 20:30:00 2020-01-04 21:00:00        30   I

I looked at the grouping and merging options in pandas, but I didn't manage to do what I want.

ansev · Accepted Answer

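Use `GroupBy.agg`, grouping on `Series.dt.date`:

# Group rows by the calendar date of end_time, then collapse each group:
# earliest start, latest end, total duration, first id.
new_df = (df.groupby(df['end_time'].dt.date, as_index=False)
            .agg({'start_time': 'first',
                  'end_time': 'last',
                  'duration': 'sum',
                  'id': 'first'}))
print(new_df)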

           start_time            end_time  duration id
0 2020-01-01 00:00:00 2020-01-01 02:00:00       120  A
1 2020-01-04 05:00:00 2020-01-04 21:00:00       150  E
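
Grouping on the calendar date puts every row from the same day into one group, so with the question's data row I (20:30) falls into the same group as rows E through H. If the goal is the three-row result shown in the question, one possible alternative is to start a new group whenever a row's `start_time` differs from the previous row's `end_time`. The following is a minimal sketch, not part of the accepted answer. It assumes the rows are already sorted by `start_time`, and the names `new_event`, `event_key` and `squashed` are made up for illustration.

```python
import pandas as pd

# Sample data rebuilt from the question: nine 30-minute intervals.
starts = pd.to_datetime([
    "2020-01-01 00:00", "2020-01-01 00:30", "2020-01-01 01:00", "2020-01-01 01:30",
    "2020-01-04 05:00", "2020-01-04 05:30", "2020-01-04 06:00", "2020-01-04 06:30",
    "2020-01-04 20:30",
])
df = pd.DataFrame({
    "start_time": starts,
    "end_time": starts + pd.Timedelta(minutes=30),
    "duration": 30,
    "id": list("ABCDEFGHI"),
})

# A row opens a new event when its start does not match the previous row's end.
# The first row compares against NaT, so it always opens an event.
new_event = df["start_time"] != df["end_time"].shift()

# The running count of "new event" flags gives one label per contiguous run.
event_key = new_event.cumsum()

# Collapse each run: earliest start, latest end, total duration, first id.
squashed = df.groupby(event_key).agg(
    start_time=("start_time", "first"),
    end_time=("end_time", "last"),
    duration=("duration", "sum"),
    id=("id", "first"),
)
print(squashed)
```

The result is indexed by the event label (1, 2, 3) rather than by the original row labels. An exact equality test only merges intervals that touch; to tolerate small gaps, the comparison could be replaced with a threshold on `df["start_time"] - df["end_time"].shift()`.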
