Reputation: 566
I have the data:
index = pd.DatetimeIndex(['2000-01-01 08:00:00', '2000-01-01 08:00:01',
'2000-01-01 08:00:03', '2000-01-01 08:00:04',
'2000-01-01 08:00:05', '2000-01-01 08:00:06',
'2000-01-01 08:00:09', '2000-01-01 08:00:10',
'2000-01-01 08:00:11', '2000-01-01 08:00:12',
'2000-01-01 08:00:15', '2000-01-01 08:00:16',
'2000-01-01 08:00:18', '2000-01-01 08:00:19',
'2000-01-01 08:00:20', '2000-01-01 08:00:22',
'2000-01-01 08:00:23', '2000-01-01 08:00:24',
'2000-01-01 08:00:25', '2000-01-01 08:00:26',
'2000-01-01 08:00:27', '2000-01-01 08:00:28'])
df = pd.DataFrame({"A": [4,3,8,15,2,3,14,17,15,20,13,2,5,15,10,10,13,20,1,3,10,11]},
index=index)
df
I found that I can get the max
for the particular time interval by:
df2 = df.groupby(pd.Grouper(freq='5s', base=2)).rolling('5s', min_periods=1).max()
df2
I'd like to get the rolling max value of last 2 groups of 5s time interval for each record with respect of uncompleted fixed actual time interval.
I found that rolling max of the two fixed 5s intervals can be obtained by:
df3 = df.resample('5s', base=2).max().rolling(2).max()
But for example in record 2000-01-01 08:00:22
should be result 15
insted of 20
. I cannot find how to combine this two approach..
Result should be:
A last-2x-5s-intervals-max
2000-01-01 07:59:57 2000-01-01 08:00:00 4 NaN (not enough data)
2000-01-01 08:00:01 3 NaN (not enough data)
2000-01-01 08:00:02 2000-01-01 08:00:03 8 8.0 (max from previous 5s to actual record/to actual unfinished 5s time interval)
2000-01-01 08:00:04 15 15.0 (max from 2000-01-01 07:59:57 to 2000-01-01 08:00:04)
2000-01-01 08:00:05 2 15.0 (max from 2000-01-01 07:59:57 to 2000-01-01 08:00:05)
2000-01-01 08:00:06 3 15.0 (max from 2000-01-01 07:59:57 to 2000-01-01 08:00:06)
2000-01-01 08:00:07 2000-01-01 08:00:09 14 15.0 (max from 2000-01-01 08:00:02 to 2000-01-01 08:00:09)
2000-01-01 08:00:10 17 17.0 (max from 2000-01-01 08:00:02 to 2000-01-01 08:00:10)
2000-01-01 08:00:11 15 17.0 (max from 2000-01-01 08:00:02 to 2000-01-01 08:00:11)
2000-01-01 08:00:12 2000-01-01 08:00:12 20 20.0 (max from 2000-01-01 08:00:07 to 2000-01-01 08:00:12)
2000-01-01 08:00:15 13 20.0 (etc...)
2000-01-01 08:00:16 2 20.0
2000-01-01 08:00:17 2000-01-01 08:00:18 5 20.0
2000-01-01 08:00:19 15 20.0
2000-01-01 08:00:20 10 20.0
2000-01-01 08:00:22 2000-01-01 08:00:22 10 15.0
2000-01-01 08:00:23 13 15.0
2000-01-01 08:00:24 20 20.0
2000-01-01 08:00:25 1 20.0
2000-01-01 08:00:26 3 20.0
2000-01-01 08:00:27 2000-01-01 08:00:27 10 20.0
2000-01-01 08:00:28 11 20.0
Upvotes: 1
Views: 137
Reputation: 566
After some reasearch, and help from @Ben.T, this is the state I wanted to achieve. Choosable higher timeframe frequency and period:
HIGHER_TF_FREQ = '5s'
HIGHER_TF_PERIOD = 4
df2gr = df.groupby(pd.Grouper(freq=HIGHER_TF_FREQ, base=2))
df2 = df2gr.rolling(HIGHER_TF_FREQ, min_periods=1).max().dropna()
df2['last_2x'] = df2.assign(l=df2gr.max().dropna().rolling(HIGHER_TF_PERIOD-1).max().shift().loc[df2.index.get_level_values(0)].to_numpy()).max(axis=1, skipna=False)
df2
Upvotes: 0
Reputation: 29635
One way would be to get the cummax
per group on the first level of index in df2
to get the current maximum in the group, assign
the max
per group of first level of index once shift
to get the max of the previous group, then reindex using loc
with the first level of index of df2 for getting the right value at the right row, and the max
over the columns:
df2gr = df2.groupby(level=0)
df2['last_2x'] = df2gr.cummax()\
.assign(l=df2gr.max().shift()
.loc[df2.index.get_level_values(0)]
.to_numpy())\
.max(axis=1, skipna=False)
print (df2)
A last_2x
2000-01-01 07:59:57 2000-01-01 08:00:00 4.0 NaN
2000-01-01 08:00:01 4.0 NaN
2000-01-01 08:00:02 2000-01-01 08:00:03 8.0 8.0
2000-01-01 08:00:04 15.0 15.0
2000-01-01 08:00:05 15.0 15.0
2000-01-01 08:00:06 15.0 15.0
2000-01-01 08:00:07 2000-01-01 08:00:09 14.0 15.0
2000-01-01 08:00:10 17.0 17.0
2000-01-01 08:00:11 17.0 17.0
2000-01-01 08:00:12 2000-01-01 08:00:12 20.0 20.0
2000-01-01 08:00:15 20.0 20.0
2000-01-01 08:00:16 20.0 20.0
2000-01-01 08:00:17 2000-01-01 08:00:18 5.0 20.0
2000-01-01 08:00:19 15.0 20.0
2000-01-01 08:00:20 15.0 20.0
2000-01-01 08:00:22 2000-01-01 08:00:22 10.0 15.0
2000-01-01 08:00:23 13.0 15.0
2000-01-01 08:00:24 20.0 20.0
2000-01-01 08:00:25 20.0 20.0
2000-01-01 08:00:26 20.0 20.0
2000-01-01 08:00:27 2000-01-01 08:00:27 10.0 20.0
2000-01-01 08:00:28 11.0 20.0
Upvotes: 1