Reputation: 3797
Update: improved the code example; it now uses a list comprehension.
I'm trying to get weekly rolling technical indicators using pandas and talib.
By "weekly rolling" I mean that if, for example, today is Thursday, then today's weekly ADX value is calculated using only this Thursday, the previous Thursday, and so forth. The previous value in the weekly ADX series was calculated using only Wednesdays, and so on. One day later, on Friday, only Fridays are used to calculate the weekly ADX. Finally, the ADX series is just all of these values appended into one single series.
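To make that definition concrete, here is a minimal sketch on toy data. The names `prices`, `by_weekday` and `thursdays` are illustrative only and not part of my real code. It only shows how the rows split by weekday; the indicator itself is left out.

```python
import numpy as np
import pandas as pd

# 20 business days starting on Thursday 2000-03-30, with values 0..19
idx = pd.date_range("2000-03-30", periods=20, freq="B")
prices = pd.Series(np.arange(20, dtype=float), index=idx)

# dayofweek is 0 for Monday ... 4 for Friday
by_weekday = {day: grp for day, grp in prices.groupby(prices.index.dayofweek)}

# the "Thursday" series that a weekly-rolling indicator would be fed
thursdays = by_weekday[3]
print(thursdays)
```

Each of the five groups keeps its own dates, which is what should make it possible to line the results up with df again.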
Currently I use a list comprehension that generates 5 lists inside "adxs_list", one per day of the week. For example, adxs_list[0] holds the talib.ADX values calculated only with Mondays, adxs_list[1] holds the values calculated only with Tuesdays, and so on.
Now I'm stuck trying to put these lists back into the original DataFrame. I tried to mash them together and then add them to the DataFrame, but couldn't figure it out...
So the question is: how can I join these calculations back to the original DataFrame while respecting the index of df?
import pandas as pd
import numpy as np
import talib
df = pd.DataFrame(np.random.randn(1000,4),
                  index=pd.date_range(pd.datetime(2000,3,30), freq='B', periods=1000),
                  columns=['PX_OPEN', 'PX_LAST', 'PX_HIGH', 'PX_LOW'])
lista4 = ['W-MON','W-TUE','W-WED','W-THU','W-FRI']
adxs_list = [([talib.ADX(df['PX_HIGH'].resample(w).values,
                         df['PX_LOW'].resample(w).values,
                         df['PX_LAST'].resample(w).values,
                         timeperiod=3)]) for w in lista4]
I tried to do it with:
adxs_frame = reduce(pd.DataFrame.combine_first, adxs_list)
and got this error:
TypeError: unbound method combine_first() must be called with DataFrame instance as first argument (got list instance instead)
Upvotes: 2
Views: 2274
Reputation: 3797
In the end, I think I figured it out. I had to transpose and then re-assign the original index. Not sure if it's the fastest way of doing it, but here it goes:
import pandas as pd
import numpy as np
import talib
from functools import reduce  # reduce is not a builtin in Python 3

df = pd.DataFrame(np.random.randn(100, 4),
                  index=pd.date_range('2000-03-30', freq='B', periods=100),
                  columns=['PX_OPEN', 'PX_LAST', 'PX_HIGH', 'PX_LOW'])
lista4 = ['W-MON', 'W-TUE', 'W-WED', 'W-THU', 'W-FRI']
# recent pandas needs an explicit aggregation after resample(); older versions averaged by default
adxs_list = [[talib.ADX(df['PX_HIGH'].resample(w).mean().values,
                        df['PX_LOW'].resample(w).mean().values,
                        df['PX_LAST'].resample(w).mean().values,
                        timeperiod=3)] for w in lista4]
# transposing the arrays and assigning them the original index of that week day
frames = [pd.DataFrame(adx).transpose().set_index(df['PX_OPEN'].resample(w).mean().index)
          for adx, w in zip(adxs_list, lista4)]
# combining all the new dataframes into a single dataframe (respecting their indexes)
i_frame = reduce(pd.DataFrame.combine_first, frames)
# merging this new dataframe into the original df
df = df.merge(i_frame, left_index=True, right_index=True)
# the new column comes out named 0 (the default column label), so renaming it
df = df.rename(columns={0: 'ADX_w'})
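A possible alternative, only as a sketch: group the rows by weekday, run the indicator on each group, and let pandas align the results back on the index. Note that this picks each weekday's rows directly instead of resampling, so the numbers will not match the code above. The `indicator` function here is a hypothetical stand-in (a 3-period rolling mean of PX_LAST) so the example runs without talib; in real use it would wrap talib.ADX on the group's high, low and close arrays and return a Series with the group's index.

```python
import numpy as np
import pandas as pd

def indicator(group):
    # Hypothetical stand-in for talib.ADX(high, low, close, timeperiod=3):
    # a 3-period rolling mean of the close, indexed like the group.
    return group["PX_LAST"].rolling(3).mean()

rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((100, 4)),
                  index=pd.date_range("2000-03-30", freq="B", periods=100),
                  columns=["PX_OPEN", "PX_LAST", "PX_HIGH", "PX_LOW"])

# One result Series per weekday (0 = Monday ... 4 = Friday), each keeping its own dates
parts = [indicator(group) for _, group in df.groupby(df.index.dayofweek)]

# Concatenate and assign; pandas aligns on the DatetimeIndex, so no merge is needed
df["ADX_w"] = pd.concat(parts).sort_index()
print(df.tail())
```

Because every partial result keeps the dates it was computed from, there is no need for transpose, set_index, or reduce; the first rows of each weekday are NaN while the indicator warms up.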
Upvotes: 1