How to split pandas dataframe based on time gaps

Question

I have a pandas data frame with a column "DATE_TIME".

DATE_TIME	SAMPLE	VALUE
2020-12-10 10:52:48	1	3.22
2020-12-10 10:52:54	2	2.93
2020-12-10 10:53:00	3	2.27

I want to split the data frame into different data frames whenever the time gap is bigger than 5 minutes.

I found this post very useful, but it does not solve my problem because my version is not producing data frames. I cannot find the mistake:

all_data["DATE_TIME"] = pd.to_datetime(all_data["DATE_TIME"])

group_samples = (all_data["DATE_TIME"].dt.minute > (all_data["DATE_TIME"].dt.minute.shift() + 5)).cumsum()
grouped = all_data["DATE_TIME"].dt.minute.groupby(group_samples)
group_list = [g for k,g in grouped]
group_list[2]

Out[]
1097    53
1100    53
1103    53
1106    54
1109    54
1112    54
1115    54
1118    54
1121    54
1124    54
1127    55
1130    55
...

Thanks a lot for any help!

Asish M. · Accepted Answer

The line

grouped = all_data["DATE_TIME"].dt.minute.groupby(group_samples)

should just be

grouped = all_data.groupby(group_samples)

The problem with the first version is that it groups only a Series holding the minute values, so each group in the output contains just those minutes instead of the rows of the data frame.

Grouping the full data frame by group_samples instead gives you every column in each group.
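
A minimal sketch of that fix, for illustration only. The sample rows below are made up (the last two are not from the question) so that a gap of more than 5 minutes appears; group_samples is computed exactly as in the question.

```python
import pandas as pd

# Hypothetical sample data: the first three rows follow the question,
# the last two are invented to create a gap of more than 5 minutes.
all_data = pd.DataFrame({
    "DATE_TIME": [
        "2020-12-10 10:52:48",
        "2020-12-10 10:52:54",
        "2020-12-10 10:53:00",
        "2020-12-10 10:59:30",
        "2020-12-10 10:59:36",
    ],
    "SAMPLE": [1, 2, 3, 4, 5],
    "VALUE": [3.22, 2.93, 2.27, 2.50, 2.61],
})
all_data["DATE_TIME"] = pd.to_datetime(all_data["DATE_TIME"])

# Grouping key from the question: the counter increases by one
# each time the minute jumps by more than 5.
minutes = all_data["DATE_TIME"].dt.minute
group_samples = (minutes > (minutes.shift() + 5)).cumsum()

# Group the full data frame, not just the minute Series,
# so every group keeps all columns.
grouped = all_data.groupby(group_samples)
group_list = [g for _, g in grouped]

# Possible alternative key (an assumption, not part of the answer above):
# compare actual time differences instead of the minute component.
alt_key = (all_data["DATE_TIME"].diff() > pd.Timedelta(minutes=5)).cumsum()
alt_list = [g for _, g in all_data.groupby(alt_key)]

print(group_list[1])
```

Each element of group_list is a DataFrame with the DATE_TIME, SAMPLE and VALUE columns. Note that comparing dt.minute only looks at the minute component, so a gap that crosses an hour boundary (for example 10:58 to 11:20) would not be detected; the diff()-based key shown as alt_key is one way to handle that case.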
