Reputation: 3783
Suppose I have a DataFrame (DF) whose index is a timestamp. The data runs from 11 AM to 6 PM each day and covers 30 days. I want to group it into 30-minute intervals. This is what I'm using:
out = DF.groupby(pd.Grouper(freq='30min'))
The start of the output is correct, but the grouping covers the whole day (24 h). For example, the new index contains timestamps like these:
11:00:00
11:30:00
12:00:00
12:30:00
...
18:00:00
18:30:00
...
23:00:00
23:30:00
...
2:00:00
2:30:00
...
...
10:30:00
11:00:00
11:30:00
As a result, many of the groups are empty, because I don't have any data between 6 PM and 11 AM.
Upvotes: 0
Views: 797
Reputation: 1821
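As mentioned in a comment on the original post, this is expected behaviour. If you want to remove the empty groups, simply filter them out afterwards. Assuming that in this case you are using count to aggregate:
df = df.groupby(pd.Grouper(freq='30min')).count()
# keep only rows where at least one column has a non-zero count;
# a plain df[df > 0] would keep every row and only mask the zeros as NaN
df = df[(df > 0).any(axis=1)]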
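To see the effect on concrete data, here is a minimal sketch. The sample frame is made up for illustration: one row per minute from 11:00 to 17:59 on two days, with a single hypothetical column named `value`. It counts rows per 30-minute bin and then keeps only the bins that actually contain data.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data (an assumption, not the asker's frame):
# one row per minute, 11:00 to 17:59, on two consecutive days.
day1 = pd.date_range("2024-01-01 11:00", "2024-01-01 17:59", freq="min")
day2 = pd.date_range("2024-01-02 11:00", "2024-01-02 17:59", freq="min")
idx = day1.append(day2)
df = pd.DataFrame({"value": np.arange(len(idx), dtype=float)}, index=idx)

# Grouper builds a continuous 30-minute grid between the first and last
# timestamp, so the overnight bins show up with a count of 0.
counts = df.groupby(pd.Grouper(freq="30min")).count()

# Keep only the bins where at least one column has a non-zero count.
non_empty = counts[(counts > 0).any(axis=1)]

print(len(counts), len(non_empty))
```

The row-wise `.any(axis=1)` matters when the frame has more than one column, because a bare boolean DataFrame used as a mask only replaces values with NaN and does not drop rows.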
Upvotes: 0
Reputation: 862751
One possible solution is DatetimeIndex.floor:
out = DF.groupby(DF.index.floor('30min'))
Or use dropna after the aggregate function:
out = DF.groupby(pd.Grouper(freq='30min')).mean().dropna()
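The two approaches can be compared side by side. The following is a sketch on made-up data (three days, one row per minute from 11:00 to 17:59, a hypothetical `value` column), so the exact bin counts are assumptions about that sample and not about the asker's frame. Flooring the index only produces keys for timestamps that exist, while `pd.Grouper` creates the full grid and relies on `dropna` to discard the empty bins.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: three days, one row per minute, 11:00 to 17:59.
days = pd.date_range("2024-01-01", periods=3, freq="D")
idx = pd.DatetimeIndex(
    [day + pd.Timedelta(minutes=m) for day in days for m in range(11 * 60, 18 * 60)]
)
DF = pd.DataFrame({"value": np.ones(len(idx))}, index=idx)

# Approach 1: round every timestamp down to its 30-minute boundary and
# group on the result; no bins are created for hours without data.
by_floor = DF.groupby(DF.index.floor("30min")).mean()

# Approach 2: build the full 30-minute grid, aggregate, then drop the
# rows that came out as NaN because their bin was empty.
by_grouper = DF.groupby(pd.Grouper(freq="30min")).mean().dropna()

print(by_floor.index.min(), by_floor.index.max(), len(by_floor))
print(by_floor.index.equals(by_grouper.index))
```

One thing to keep in mind with the second approach: `dropna()` also removes non-empty bins whose aggregate happens to be NaN (for example, a bin containing only missing values), whereas the `floor` approach keeps them.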
Upvotes: 1