Reputation: 59
I have a pandas data frame with a column "DATE_TIME".
DATE_TIME | SAMPLE | VALUE |
---|---|---|
2020-12-10 10:52:48 | 1 | 3.22 |
2020-12-10 10:52:54 | 2 | 2.93 |
2020-12-10 10:53:00 | 3 | 2.27 |
I want to split the data frame into separate data frames whenever the time gap between consecutive rows is bigger than 5 minutes.
I found this post very useful, but it does not solve my problem because the result is not a list of data frames. I cannot find the mistake:
all_data["DATE_TIME"] = pd.to_datetime(all_data["DATE_TIME"])
group_samples = (all_data["DATE_TIME"].dt.minute > (all_data["DATE_TIME"].dt.minute.shift() + 5)).cumsum()
grouped = all_data["DATE_TIME"].dt.minute.groupby(group_samples)
group_list = [g for k,g in grouped]
group_list[2]
Out[]
1097 53
1100 53
1103 53
1106 54
1109 54
1112 54
1115 54
1118 54
1121 54
1124 54
1127 55
1130 55
...
Thanks a lot for any help!
Upvotes: 4
Views: 1391
Reputation: 2647
This line:
grouped = all_data["DATE_TIME"].dt.minute.groupby(group_samples)
should just be:
grouped = all_data.groupby(group_samples)
The problem with the first version is that you're only grouping a Series that holds just the minutes, so each group in the output contains only the minute values of the split data frame.
Grouping the full data frame by group_samples instead gives you all the columns in each group.
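Here is a minimal sketch of the whole thing. The sample data is made up: it uses the three rows from the question plus two hypothetical rows that start more than 5 minutes later. It also swaps the dt.minute comparison for a diff() on the full timestamps, which is my own suggestion rather than something from the question. Comparing dt.minute values only looks at the minute of the hour, so a gap that crosses an hour boundary (say 10:53 to 11:02) would not be detected.

```python
import pandas as pd

# Hypothetical sample data: the three rows from the question plus two
# made-up rows that start more than 5 minutes later.
all_data = pd.DataFrame({
    "DATE_TIME": [
        "2020-12-10 10:52:48",
        "2020-12-10 10:52:54",
        "2020-12-10 10:53:00",
        "2020-12-10 11:02:00",
        "2020-12-10 11:02:06",
    ],
    "SAMPLE": [1, 2, 3, 4, 5],
    "VALUE": [3.22, 2.93, 2.27, 1.50, 1.75],
})
all_data["DATE_TIME"] = pd.to_datetime(all_data["DATE_TIME"])

# A new group starts wherever the gap to the previous row exceeds 5 minutes.
# diff() compares full timestamps, so gaps across an hour boundary count too.
gap = all_data["DATE_TIME"].diff() > pd.Timedelta(minutes=5)
group_samples = gap.cumsum()

# Group the whole data frame so every column is kept in each piece.
group_list = [g for _, g in all_data.groupby(group_samples)]

print(len(group_list))
print(group_list[0])
```

Each element of group_list is then a regular data frame with DATE_TIME, SAMPLE and VALUE, so group_list[0], group_list[1], and so on can be used directly.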
Upvotes: 1