ikenas
ikenas

Reputation: 431

How to get the max value of 30 natural days? Pandas

I'm tring to find the price before 30 natural days and i can't just shift(-30) rows back and get the price because i only have 24 working days.

# my dataframe
        xtime               open    close
0       2004-03-17 22:00:00 3451.0  3452.0
1       2004-03-18 22:00:00 3459.0  3452.0
2       2004-03-21 22:00:00 3442.0  3400.0
3       2004-03-22 22:00:00 3397.0  3411.0
4       2004-03-23 22:00:00 3409.0  3392.0
... ... ... ...
4266    2020-11-15 22:00:00 6453.7  6492.9
4267    2020-11-16 22:00:00 6505.1  6514.0
4268    2020-11-17 22:00:00 6514.1  6510.6
4269    2020-11-18 22:00:00 6507.4  6565.6
4270    2020-11-19 22:00:00 6544.2  6559.5

What I'm tring to do:

from datetime import timedelta as td, datetime as dt

#here i'm making a days counter to count the number of days since 1970
df['day_num'] = list(map( lambda d: int(dt.timestamp(d) / (24*60*60))  , df['xtime'] ))

# 30 days before
df['d30'] = df['day_num'] - 30

this will help me to know the index of the nearest day in day_num... but after that i don't know how to continue.

I also tried this way to get the index but resultet to be very slow.

tm = df['xtime'].tolist()
ner = df['close'].tolist()

def nearest( days = 30 ):
    for i, x in enumerate(zip(tm,ner)):
        t, y = x
        gt = t-td(days=days)
        if i == 0:
           yield 0
        else:
           mdays = days - 10
           nt = tm.index(
             min(tm[i-days if i > days else 0:i], 
                 key = lambda f: abs(f - gt) 
           ))

           yield int(ner[i]==max(ner[nt:i+1]))

df['ne'] = list(nearest())

I also tried with Ta-Lib package talib.MAX() finction but it only rolls 30 rows up

Thank you in advance.

Upvotes: 0

Views: 570

Answers (1)

yulGM
yulGM

Reputation: 1094

The following will look back up to some date range, and will get the first available value within that range: (i.e. look up to 10 days back, and see first item recorded within that period). if there is nothing it will return existing value. if earliest record within that date range is fewer days back, that's the one that will return (i.e. if you look back 10 days, but only have value 5 days ago, that's the value you'd get)

Set your date to index. Then do df.rolling() looking back n days, and get the earliest value within that date range.

eg based on your data:

import pandas as pd

data = [{'xtime': '2004-03-17 22:00:00', 'open': 3451.0, 'close': 3452.0}, {'xtime': '2004-03-18 22:00:00', 'open': 3459.0, 'close': 3452.0}, {'xtime': '2004-03-21 22:00:00', 'open': 3442.0, 'close': 3400.0}, {'xtime': '2004-03-22 22:00:00', 'open': 3397.0, 'close': 3411.0}, {'xtime': '2004-03-23 22:00:00', 'open': 3409.0, 'close': 3392.0}, {'xtime': '2004-03-24 22:00:00', 'open': 3410.0, 'close': 3393.0}, {'xtime': '2004-03-25 22:00:00', 'open': 3411.0, 'close': 3394.0}, {'xtime': '2004-03-26 22:00:00', 'open': 3412.0, 'close': 3395.0}, {'xtime': '2004-03-27 22:00:00', 'open': 6453.7, 'close': 6492.9}, {'xtime': '2004-03-28 22:00:00', 'open': 6505.1, 'close': 6514.0}, {'xtime': '2004-03-29 22:00:00', 'open': 6514.1, 'close': 6510.6}, {'xtime': '2004-03-30 22:00:00', 'open': 6507.4, 'close': 6565.6}, {'xtime': '2004-03-31 22:00:00', 'open': 6544.2, 'close': 6559.5}]

df=pd.DataFrame(data)

## set 'xtime' to date type:
df['xtime']=pd.to_datetime(df['xtime'])

## set datetime to index:
df.set_index('xtime', drop=True, inplace=True)


## get minimum values from up to 5 days before
## (change 5d to whatever date range you need)
df_lookback = df.rolling('3d', min_periods=1).apply(lambda x: x[0])

## combine results and reset index:
df_combined = pd.merge(df, df_lookback.rename(columns={'open':'open_old','close':'close_old'}), on='xtime', how='left').reset_index()


print(df_combined)

OUTPUT:

    xtime               open    close   open_old close_old
0   2004-03-17 22:00:00 3451.0  3452.0  3451.0  3452.0
1   2004-03-18 22:00:00 3459.0  3452.0  3451.0  3452.0
2   2004-03-21 22:00:00 3442.0  3400.0  3451.0  3452.0
3   2004-03-22 22:00:00 3397.0  3411.0  3459.0  3452.0
4   2004-03-23 22:00:00 3409.0  3392.0  3442.0  3400.0
5   2004-03-24 22:00:00 3410.0  3393.0  3442.0  3400.0
6   2004-03-25 22:00:00 3411.0  3394.0  3442.0  3400.0
7   2004-03-26 22:00:00 3412.0  3395.0  3397.0  3411.0
8   2004-03-27 22:00:00 6453.7  6492.9  3409.0  3392.0
9   2004-03-28 22:00:00 6505.1  6514.0  3410.0  3393.0
10  2004-03-29 22:00:00 6514.1  6510.6  3411.0  3394.0
11  2004-03-30 22:00:00 6507.4  6565.6  3412.0  3395.0
12  2004-03-31 22:00:00 6544.2  6559.5  6453.7  6492.9

Upvotes: 1

Related Questions