CpF

Reputation: 59

How to split pandas dataframe based on time gaps

I have a pandas data frame with a column "DATE_TIME".

DATE_TIME SAMPLE VALUE
2020-12-10 10:52:48 1 3.22
2020-12-10 10:52:54 2 2.93
2020-12-10 10:53:00 3 2.27

I want to split the data frame into different data frames whenever the time gap is bigger than 5 minutes.

I found this post very useful, but it does not solve my problem because it does not create data frames. I cannot find the mistake:

all_data["DATE_TIME"] = pd.to_datetime(all_data["DATE_TIME"])

group_samples = (all_data["DATE_TIME"].dt.minute > (all_data["DATE_TIME"].dt.minute.shift() + 5)).cumsum()
grouped = all_data["DATE_TIME"].dt.minute.groupby(group_samples)
group_list = [g for k,g in grouped]
group_list[2]

Out[]
1097    53
1100    53
1103    53
1106    54
1109    54
1112    54
1115    54
1118    54
1121    54
1124    54
1127    55
1130    55
...

Thanks a lot for any help!

Upvotes: 4

Views: 1391

Answers (1)

Asish M.

Reputation: 2647

The line grouped = all_data["DATE_TIME"].dt.minute.groupby(group_samples) should just be grouped = all_data.groupby(group_samples).

The problem with the first version is that you're only grouping a Series that holds just the minutes, so each group in the output contains only the minute values instead of the rows of the data frame.

Grouping the full data frame by group_samples instead gives you all the columns in each group.
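For reference, here is a minimal sketch of the whole thing on made-up sample data (the rows and values below are hypothetical, not from the question). Note that it swaps the dt.minute comparison for diff() against a pd.Timedelta, which is my own substitution rather than what the question used: comparing minute-of-hour values misses gaps that cross an hour boundary (e.g. 10:52 to 11:02), whereas differencing the full timestamps does not.

```python
import pandas as pd

# Hypothetical sample data: the jump from 10:52:54 to 11:02:00 is a gap
# of more than 5 minutes that also crosses an hour boundary.
all_data = pd.DataFrame({
    "DATE_TIME": ["2020-12-10 10:52:48", "2020-12-10 10:52:54",
                  "2020-12-10 11:02:00", "2020-12-10 11:02:06"],
    "SAMPLE": [1, 2, 3, 4],
    "VALUE": [3.22, 2.93, 2.27, 2.50],
})
all_data["DATE_TIME"] = pd.to_datetime(all_data["DATE_TIME"])

# True wherever the gap to the previous row exceeds 5 minutes.
# The first row's diff is NaT, which compares as False.
gaps = all_data["DATE_TIME"].diff() > pd.Timedelta(minutes=5)

# The running sum of the flags gives each row a group id.
group_samples = gaps.cumsum()

# Group the full data frame so every column is kept in each piece.
group_list = [g for _, g in all_data.groupby(group_samples)]
```

Each element of group_list is then a data frame with the DATE_TIME, SAMPLE and VALUE columns, so group_list[0] and group_list[1] hold the rows before and after the gap.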

Upvotes: 1
