S.Peters
S.Peters

Reputation: 185

Data frame shift functionality

I currently have two data frames and wish to create a third data frame based upon two conditions both being satisfied: TRUE if the corresponding value in data frame 1 has just broken through 2 AND the value of data frame 2 is <= 0.2, else FALSE.

df1

enter image description here

df2

enter image description here

And the code should create df3 as follows:

enter image description here

i.e condition is only TRUE for 5y on 13-aug-17 as 5y has broken above 2 AND the value in df2 <=0.2.

The purpose of this is to replicate what has been done here mean reversion strategy over an entire data frame, rather than just for a single time-series, i.e for each of the steps in the example do the process at the data frame, rather than time series level.

So the question is how to apply the shift function below at the dataframe rather than column level.

df4['short entry'] = ((df4.zScore < - entryZscore) & ( df4.zScore.shift(1) > - entryZscore)&(df4['hurst'] < hurstentry))

Upvotes: 1

Views: 99

Answers (2)

Mederic Fourmy
Mederic Fourmy

Reputation: 331

Short answer:

df3 = ((df1 > -entryZscore) & (df1.shift(1) < -entryZscore) & (df2 < hurstentry))

I am not sure to understand the mean reversion strategy problem, but if entryZscore and hurstentry are the same for all columns, you can try this:

import pandas as pd

date_index = pd.date_range('2017-08-10', '2017-08-13')
cols = ['2y','5y','10y']

df1 = pd.DataFrame([[1.3,1.3,1.3],[1.4,1.4,1.4],[1.9,2.1,1.9],[1.9,1.9,1.9]],
                   columns=cols, index=date_index)
df2 = pd.DataFrame([[0.3,0.3,0.3],[0.1,0.1,0.1],[0.1,0.1,0.1], [0.3,0.3,0.3]],
                   columns=cols, index=date_index)

entryZscore = -2
hurstentry = 0.2

df3 = ((df1 > -entryZscore) & (df1.shift(1) < -entryZscore) & (df2 < hurstentry))

Output:

            2y      5y      10y
2017-08-10  False   False   False
2017-08-11  False   False   False
2017-08-12  False   True    False
2017-08-13  False   False   False

Upvotes: 1

White
White

Reputation: 637

This might be something helpful

import pandas as pd
import numpy as np

df1 = pd.DataFrame([['11',1.3,1.3,1.3],['12',1.4,1.4,1.4],['13',1.9,2.1,1.9], ['14',1.9,1.9,1.9]])
df1.columns = ['date','2y','5y','10y']
df2 = pd.DataFrame([['11',0.3,0.3,0.3],['12',0.1,0.1,0.1],['13',0.1,0.1,0.1], ['14',0.3,0.3,0.3]])
df2.columns = ['date','2y','5y','10y']

df = pd.merge(df1, df2, on='date', suffixes=['_zscore','_hurst'])
entryZscore = -2
hurstentry = 0.2
for x in ['2y','5y','10y']:
    df[x+'_short'] = ((df[x+'_zscore'] > -entryZscore) & ( df[x+'_zscore'].shift(1) < -entryZscore)&(df[x+'_hurst'] < hurstentry))
entries = ['date'] + [x+'_short' for x in  ['2y','5y','10y']]
result = df[entries]
print result

output as

  date 2y_short 5y_short 10y_short
0   11    False    False     False
1   12    False    False     False
2   13    False     True     False
3   14    False    False     False

I don't recommend approach above. It seems a bit messy. If possible, you could have three data frame. Each frame will hold data for single duration. i.e. 1 data frame for 5 year related data, 1 for 10 year and 1 for 2 year. You could create a function to parse single data frame. In this way, you could just copy the code from the mean reversion sample. Understanding the math is more crucial than handling data in complex ways. Hope it helps.

Upvotes: 0

Related Questions