I have a DataFrame of timestamps and bid prices. For every bid price I want to take a rolling window of the previous 1000 bid prices, find the highest value in that window, and return the position where that high occurred. The position must be relative to the rolling window, so it will be a number between 1 and 1000. Here's the code I came up with, and it works:
df['High time'] = price_frame - ( df.index.values - df['Bid'].rolling(window=price_frame).agg(lambda x: x.index.values[np.argmax(x.values)]) )
As stated, it works, but it's highly inefficient: it takes about 15 to 20 seconds to compute. Is there a more efficient way to do this? Thanks.
I'm hoping for faster code, if possible.
I have this dataframe:
Timestamp Bid
0 2023-12-01 00:00:02.033000+00:00 1.26305
1 2023-12-01 00:00:05.974000+00:00 1.26306
2 2023-12-01 00:00:07.587000+00:00 1.26305
3 2023-12-01 00:00:07.689000+00:00 1.26306
4 2023-12-01 00:00:07.792000+00:00 1.26304
... ... ...
2383418 2024-01-04 23:59:36.774000+00:00 1.26810
2383419 2024-01-04 23:59:41.153000+00:00 1.26808
2383420 2024-01-04 23:59:41.256000+00:00 1.26808
2383421 2024-01-04 23:59:55.951000+00:00 1.26805
2383422 2024-01-04 23:59:58.573000+00:00 1.26807
For each row I want to create a rolling window of the 1000 previous bid prices, then find the max value of that window and return the position within the window where that high occurred. I'm using this code:
df['High time'] = price_frame - ( df.index.values - df['Bid'].rolling(window=price_frame).agg(lambda x: x.index.values[np.argmax(x.values)]) )
I obtain the desired result:
Timestamp Bid High time
0 2023-12-01 00:00:02.033000+00:00 1.26305 NaN
1 2023-12-01 00:00:05.974000+00:00 1.26306 NaN
2 2023-12-01 00:00:07.587000+00:00 1.26305 NaN
3 2023-12-01 00:00:07.689000+00:00 1.26306 NaN
4 2023-12-01 00:00:07.792000+00:00 1.26304 NaN
... ... ... ...
2383418 2024-01-04 23:59:36.774000+00:00 1.26810 255.0
2383419 2024-01-04 23:59:41.153000+00:00 1.26808 254.0
2383420 2024-01-04 23:59:41.256000+00:00 1.26808 253.0
2383421 2024-01-04 23:59:55.951000+00:00 1.26805 252.0
2383422 2024-01-04 23:59:58.573000+00:00 1.26807 251.0
The problem is that it takes up to 20 seconds, so I'd like to know if there's a faster way.
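To pin down what the formula above computes, here is a small brute-force sketch on made-up numbers. The function name `high_time_reference` and the sample values are illustrative assumptions, not part of the real data; it is meant as a slow reference for checking faster versions, not as a solution.

```python
import numpy as np

def high_time_reference(bids, window):
    """Brute-force reference: 1-based position of the max in each trailing window."""
    bids = np.asarray(bids, dtype=float)
    out = np.full(len(bids), np.nan)  # rows without a full window stay NaN
    for i in range(window - 1, len(bids)):
        chunk = bids[i - window + 1 : i + 1]  # the current row plus the rows before it
        out[i] = np.argmax(chunk) + 1         # first occurrence wins on ties
    return out

sample = [1.0, 3.0, 2.0, 5.0, 4.0, 4.5]  # hypothetical bid prices
result = high_time_reference(sample, window=3)
print(result)
```

With a window of 3, the first two rows are NaN and the remaining rows give the position (1 to 3) of the high inside each window, which is the same convention as the 1 to 1000 values in the `High time` column.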
Upvotes: 0
Views: 387
Reputation: 73
Does something like this do the trick?
df['High time'] = [window.to_list().index(max(window.to_list())) for window in df['Bid'].rolling(price_frame)]
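One thing worth checking with this approach: iterating over a rolling object also yields the shorter windows at the start, and `list.index` is 0-based, so the output is not identical to the question's column. A small sketch on a toy Series (the values and the names `bids`, `sizes`, `positions` and `high_time` are illustrative assumptions) shows the difference and one possible adjustment:

```python
import numpy as np
import pandas as pd

bids = pd.Series([1.0, 3.0, 2.0, 5.0, 4.0])  # hypothetical bid prices
window = 3

# Iterating a rolling object yields one window per row, including partial ones.
sizes = [len(w) for w in bids.rolling(window)]

# 0-based position of the first maximum in each window, as in the answer above.
positions = [w.to_list().index(max(w.to_list())) for w in bids.rolling(window)]

# Shift to 1-based and blank out the partial windows to mimic the question's output.
high_time = [p + 1 if n == window else np.nan for p, n in zip(positions, sizes)]

print(sizes)
print(positions)
print(high_time)
```

Since this still builds a Python list for every window, I would not expect it to beat the original on 2.4 million rows.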
Upvotes: 0
Reputation: 441
Probably not a lot faster, but it is a bit simpler. idxmax() returns the index of the max value. Unfortunately it doesn't work directly with rolling, which is why I used apply. Note that this gives the index label in df rather than the position inside the window.
df['High time'] = df['Bid'].rolling(price_frame).apply(lambda x: x.idxmax())
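Since the question asks for a position inside the window (1 to 1000), a vectorized alternative might be worth trying. The following is a sketch, not a benchmarked solution: it uses NumPy's `sliding_window_view` (available from NumPy 1.20) to avoid calling a Python function per window. The helper name `rolling_argmax_position` and the toy data are assumptions for illustration, and I have not timed it on the full 2.4 million rows.

```python
import numpy as np
import pandas as pd
from numpy.lib.stride_tricks import sliding_window_view

def rolling_argmax_position(values, window):
    """1-based position of the max within each trailing window (NaN until the window is full)."""
    values = np.asarray(values, dtype=float)
    out = np.full(values.shape[0], np.nan)
    if values.shape[0] >= window:
        # View of shape (n - window + 1, window); no per-window copy is made here.
        windows = sliding_window_view(values, window)
        out[window - 1:] = windows.argmax(axis=1) + 1  # first max wins on ties
    return out

df = pd.DataFrame({"Bid": [1.0, 3.0, 2.0, 5.0, 4.0, 4.5]})  # hypothetical prices
price_frame = 3
df["High time"] = rolling_argmax_position(df["Bid"].to_numpy(), price_frame)
print(df)
```

This still does work proportional to rows times window size, just in compiled code instead of a Python lambda, so whether it is fast enough would need to be measured on the real data.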
Upvotes: 0