Reputation: 368
I am trying to group values in a pandas df based on time. Specifically, I want to return values every 15 min. I can do this using the code below, but I'm hoping to return a value for every 15 min segment, even if no values fall in that segment in the df. For these segments I was going to use ffill(), so that the previous value would be allocated to that segment.
import pandas as pd
d = ({
'Time' : ['0/1/1900 8:00:00','0/1/1900 9:59:00','0/1/1900 10:00:00','0/1/1900 12:29:00','0/1/1900 12:30:00','0/1/1900 13:00:00','0/1/1900 13:02:00','0/1/1900 13:15:00','0/1/1900 13:20:00','0/1/1900 18:10:00','0/1/1900 18:15:00','0/1/1900 18:20:00','0/1/1900 18:25:00','0/1/1900 18:45:00','0/1/1900 18:50:00','0/1/1900 19:05:00','0/1/1900 19:07:00','0/1/1900 21:57:00','0/1/1900 22:00:00','0/1/1900 22:30:00','0/1/1900 22:35:00','1/1/1900 3:00:00','1/1/1900 3:05:00','1/1/1900 3:20:00','1/1/1900 3:25:00'],
'People' : [1,1,2,2,3,3,2,2,3,3,4,4,3,3,2,2,3,3,4,4,3,3,2,2,1],
})
df = pd.DataFrame(data = d)
df['Time'] = ['/'.join([str(int(x.split('/')[0])+1)] + x.split('/')[1:]) for x in df['Time']]
df['Time'] = pd.to_datetime(df['Time'], format='%d/%m/%Y %H:%M:%S')
df = df.groupby(df.Time.dt.floor('15T'))['People'].max()
print(df)
Time
1900-01-01 08:00:00 1
1900-01-01 09:45:00 1
1900-01-01 10:00:00 2
1900-01-01 12:15:00 2
1900-01-01 12:30:00 3
1900-01-01 13:00:00 3
1900-01-01 13:15:00 3
1900-01-01 18:00:00 3
1900-01-01 18:15:00 4
1900-01-01 18:45:00 3
1900-01-01 19:00:00 3
1900-01-01 21:45:00 3
1900-01-01 22:00:00 4
1900-01-01 22:30:00 4
1900-01-02 03:00:00 3
1900-01-02 03:15:00 2
This series only contains the groups that are present in the df. Is it possible to determine values for every 15 min segment by forward filling from the previous group? Can the following call be altered to assign values to those segments?
df = df.groupby(df.Time.dt.floor('H'))['People'].max()
Upvotes: 3
Views: 2846
Reputation: 323226
If you want to use groupby here, you need pd.Grouper, which also creates the empty 15 min bins so that ffill() has something to fill:
df.groupby(pd.Grouper(freq='15T',key='Time'))['People'].max().ffill()
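To make the effect of the Grouper easier to see, here is a minimal sketch on a small made-up frame (the three rows below are illustrative, not the question's data). It assumes 'Time' is already a datetime column and uses the '15min' alias, which newer pandas versions prefer over '15T'.

```python
import pandas as pd

# Made-up sample: two readings in the 08:00 bin, then nothing until 08:50.
df = pd.DataFrame({
    'Time': pd.to_datetime(['1900-01-01 08:00:00',
                            '1900-01-01 08:05:00',
                            '1900-01-01 08:50:00']),
    'People': [1, 2, 3],
})

# pd.Grouper builds one bin per 15 minutes between the first and last
# timestamp, including bins that contain no rows.
per_bin = df.groupby(pd.Grouper(key='Time', freq='15min'))['People'].max()

# Empty bins come back as NaN; ffill() copies the previous bin's value
# into them, and astype(int) restores the integer dtype afterwards.
filled = per_bin.ffill().astype(int)
print(filled)
```

The 08:15 and 08:30 bins have no rows of their own, so they inherit the maximum of the 08:00 bin.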
Upvotes: 5
Reputation: 57033
You can resample the result at 15 min frequency and forward fill the gaps:
df.set_index('Time').resample('15T').max().ffill().astype(int)
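As a rough illustration of the resample approach, the sketch below runs it on a small made-up frame (illustrative rows only). It assumes the frame still has 'Time' as a regular datetime column, that is, before the groupby in the question overwrites df, and it uses the '15min' alias in place of '15T'.

```python
import pandas as pd

# Made-up sample: two readings in the 22:30 bin, then a gap until 23:10.
df = pd.DataFrame({
    'Time': pd.to_datetime(['1900-01-01 22:30:00',
                            '1900-01-01 22:35:00',
                            '1900-01-01 23:10:00']),
    'People': [4, 3, 2],
})

# resample() needs a datetime index, so move 'Time' into the index first.
# Each 15 minute bin takes its maximum; bins without rows become NaN.
resampled = df.set_index('Time').resample('15min').max()

# Forward fill the empty bins, then convert back from float to int.
result = resampled.ffill().astype(int)
print(result)
```

Here the 22:45 bin is empty, so it takes the value of the 22:30 bin. Unlike the groupby version, the result is a DataFrame with a 'People' column instead of a Series.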
Upvotes: 4