Reputation: 431
I'm tring to find the price before 30 natural days and i can't just shift(-30) rows back and get the price because i only have 24 working days.
# my dataframe
xtime open close
0 2004-03-17 22:00:00 3451.0 3452.0
1 2004-03-18 22:00:00 3459.0 3452.0
2 2004-03-21 22:00:00 3442.0 3400.0
3 2004-03-22 22:00:00 3397.0 3411.0
4 2004-03-23 22:00:00 3409.0 3392.0
... ... ... ...
4266 2020-11-15 22:00:00 6453.7 6492.9
4267 2020-11-16 22:00:00 6505.1 6514.0
4268 2020-11-17 22:00:00 6514.1 6510.6
4269 2020-11-18 22:00:00 6507.4 6565.6
4270 2020-11-19 22:00:00 6544.2 6559.5
What I'm tring to do:
from datetime import timedelta as td, datetime as dt
#here i'm making a days counter to count the number of days since 1970
df['day_num'] = list(map( lambda d: int(dt.timestamp(d) / (24*60*60)) , df['xtime'] ))
# 30 days before
df['d30'] = df['day_num'] - 30
this will help me to know the index of the nearest day in day_num... but after that i don't know how to continue.
I also tried this way to get the index but resultet to be very slow.
tm = df['xtime'].tolist()
ner = df['close'].tolist()
def nearest( days = 30 ):
for i, x in enumerate(zip(tm,ner)):
t, y = x
gt = t-td(days=days)
if i == 0:
yield 0
else:
mdays = days - 10
nt = tm.index(
min(tm[i-days if i > days else 0:i],
key = lambda f: abs(f - gt)
))
yield int(ner[i]==max(ner[nt:i+1]))
df['ne'] = list(nearest())
I also tried with Ta-Lib package
talib.MAX()
finction
but it only rolls 30 rows up
Thank you in advance.
Upvotes: 0
Views: 570
Reputation: 1094
The following will look back up to some date range, and will get the first available value within that range: (i.e. look up to 10 days back, and see first item recorded within that period). if there is nothing it will return existing value. if earliest record within that date range is fewer days back, that's the one that will return (i.e. if you look back 10 days, but only have value 5 days ago, that's the value you'd get)
Set your date to index. Then do df.rolling() looking back n
days, and get the earliest value within that date range.
eg based on your data:
import pandas as pd
data = [{'xtime': '2004-03-17 22:00:00', 'open': 3451.0, 'close': 3452.0}, {'xtime': '2004-03-18 22:00:00', 'open': 3459.0, 'close': 3452.0}, {'xtime': '2004-03-21 22:00:00', 'open': 3442.0, 'close': 3400.0}, {'xtime': '2004-03-22 22:00:00', 'open': 3397.0, 'close': 3411.0}, {'xtime': '2004-03-23 22:00:00', 'open': 3409.0, 'close': 3392.0}, {'xtime': '2004-03-24 22:00:00', 'open': 3410.0, 'close': 3393.0}, {'xtime': '2004-03-25 22:00:00', 'open': 3411.0, 'close': 3394.0}, {'xtime': '2004-03-26 22:00:00', 'open': 3412.0, 'close': 3395.0}, {'xtime': '2004-03-27 22:00:00', 'open': 6453.7, 'close': 6492.9}, {'xtime': '2004-03-28 22:00:00', 'open': 6505.1, 'close': 6514.0}, {'xtime': '2004-03-29 22:00:00', 'open': 6514.1, 'close': 6510.6}, {'xtime': '2004-03-30 22:00:00', 'open': 6507.4, 'close': 6565.6}, {'xtime': '2004-03-31 22:00:00', 'open': 6544.2, 'close': 6559.5}]
df=pd.DataFrame(data)
## set 'xtime' to date type:
df['xtime']=pd.to_datetime(df['xtime'])
## set datetime to index:
df.set_index('xtime', drop=True, inplace=True)
## get minimum values from up to 5 days before
## (change 5d to whatever date range you need)
df_lookback = df.rolling('3d', min_periods=1).apply(lambda x: x[0])
## combine results and reset index:
df_combined = pd.merge(df, df_lookback.rename(columns={'open':'open_old','close':'close_old'}), on='xtime', how='left').reset_index()
print(df_combined)
OUTPUT:
xtime open close open_old close_old
0 2004-03-17 22:00:00 3451.0 3452.0 3451.0 3452.0
1 2004-03-18 22:00:00 3459.0 3452.0 3451.0 3452.0
2 2004-03-21 22:00:00 3442.0 3400.0 3451.0 3452.0
3 2004-03-22 22:00:00 3397.0 3411.0 3459.0 3452.0
4 2004-03-23 22:00:00 3409.0 3392.0 3442.0 3400.0
5 2004-03-24 22:00:00 3410.0 3393.0 3442.0 3400.0
6 2004-03-25 22:00:00 3411.0 3394.0 3442.0 3400.0
7 2004-03-26 22:00:00 3412.0 3395.0 3397.0 3411.0
8 2004-03-27 22:00:00 6453.7 6492.9 3409.0 3392.0
9 2004-03-28 22:00:00 6505.1 6514.0 3410.0 3393.0
10 2004-03-29 22:00:00 6514.1 6510.6 3411.0 3394.0
11 2004-03-30 22:00:00 6507.4 6565.6 3412.0 3395.0
12 2004-03-31 22:00:00 6544.2 6559.5 6453.7 6492.9
Upvotes: 1