Pandas.Series.ewm for calculating a RSI Indicator, with Wilders Moving Average

Question

I asked a question earlier today. I have found a partial Solution to it: pandas.Series.ewm

Data: I have a Series, see example

date    close   change  gain    loss    avg_gain    avg_loss    rs  rsi
1   54.8                            
2   56.8    2   2   0               
3   57.85   1.05    1.05    0               
4   59.85   2   2   0               
5   60.57   0.72    0.72    0               
6   61.1    0.53    0.53    0               
7   62.17   1.07    1.07    0               
8   60.6    -1.57   0   1.57                
9   62.35   1.75    1.75    0               
10  62.15   -0.2    0   0.2             
11  62.35   0.2 0.2 0               
12  61.45   -0.9    0   0.9             
13  62.8    1.35    1.35    0               
14  61.37   -1.43   0   1.43                
15  62.5    1.13    1.13    0   0.842857143 0.292857143 2.87804878  74.21383648
16  62.57   0.07    0.07    0   0.787653061 0.271938776 2.896435272 74.33551618
17  60.8    -1.77   0   1.77    0.731392128 0.378943149 1.930084053 65.87128622
18  59.37   -1.43   0   1.43    0.679149833 0.454018638 1.495863333 59.93370364
19  60.35   0.98    0.98    0   0.700639131 0.421588735 1.661901925 62.43287589
20  62.35    2      2       0   0.793450622 0.391475254 2.026821908 66.96204698
21  62.17   -0.18   0   0.18    0.736775577 0.376369879 1.957583798 66.18861651
22  62.55   0.38    0.38    0   0.711291607 0.349486316 2.035248805 67.05377173
23  64.55     2     2       0   0.803342207 0.324523008 2.475455322 71.22679168
24  64.37   -0.18   0   0.18    0.745960621 0.314199936 2.374159048 70.36298569
25  65.3    0.93    0.93    0   0.759106291 0.291757083 2.601843568 72.23644
26  64.42   -0.88   0   0.88    0.704884413 0.333774435 2.111858608 67.86486387
27  62.9    -1.52   0   1.52    0.654535526 0.418504832 1.563985589 60.99822072
28  61.6    -1.3    0   1.3     0.607782989 0.481468773 1.262351835 55.79821031
29  62.05   0.45    0.45    0   0.596512775 0.447078146 1.334247224 57.15963631
30  60.05   -2  0   2           0.55390472  0.558001136 0.992658768 49.81579304
31  59.7    -0.35   0   0.35    0.514340097 0.543143912 0.946968356 48.63809691
32  60.9    1.2 1.2 0           0.563315804 0.504347918 1.116919064 52.76153835
33  60.25   -0.65   0   0.65    0.523078961 0.514751638 1.016177361 50.40118893
34  58.27   -1.98   0   1.98    0.485716178 0.619412235 0.784156577 43.9511076
35  58.7    0.43    0.43    0   0.481736451 0.575168504 0.837557077 45.5799217
36  57.72   -0.98   0   0.98    0.447326705 0.60408504  0.740502868 42.54534031
37  58.1    0.38    0.38    0   0.442517654 0.560936108 0.788891369 44.09945638
38  58.2    0.1 0.1 0           0.418052108 0.520869243 0.802604709 44.52472054

Formulas in Excel

change: previous_close - current_close
gain: =IF(change>0; change; 0)
loss: =IF(change<0; -change; 0)
avg_gain (first row): =AVERAGE("over the first n gains")

5. avg_gain (every next row): =(prev. avg_gain * (n - 1) + current gain) / n

avg_loss (first row): =AVERAGE("over the first n gains")

7. avg_loss (every next row): =(prev. avg_loss * (n - 1) + current loss) / n

rs: avg_gain / avg_loss
rsi: 100-(100/(1+rs))

Formula in Python for change, gain and loss: (works perfectly fine)

delta = data.diff()
delta_up = delta.clip(lower=0)
delta_down = delta.clip(upper=0).abs()

Problem: I have to formulas but I cant combine them.

Formula for the first average gain/loss: (on its own it works)

The first Average needs to be an Simple Moving Average

first_avg_gain = delta_up.rolling(
    window=periods, min_periods=periods
).mean()[: periods + 1]

first_avg_loss = delta_down.rolling(
    window=periods, min_periods=periods
).mean()[: periods + 1]

Formula for the following average gains/losses:

This is calculated like that in Excel: (prev. avg_gain * (period_length - 1) + gain)/period_length

The Formula following should do the same

avg_gain = delta_up.ewm(
    com=periods - 1, adjust=False, min_periods=periods
).mean()

avg_loss = delta_down.ewm(
    com=periods - 1, adjust=False, min_periods=periods
).mean()

Formula in Python for rs and rsi:

rs = avg_gain / avg_loss
indicator = 100 - 100 / (1 + rs)

Question: How can I combine first_avg_gain and avg_gain / first_avg_loss and avg_loss?

I need to implement this code, but it doesnt work

for i, row in enumerate(df['avg_gain'].iloc[window_length+1:]):
    df['avg_gain'].iloc[i + window_length + 1] =\
        (df['avg_gain'].iloc[i + window_length] *
         (window_length - 1) +
         df['gain'].iloc[i + window_length + 1])\
        / window_length

for i, row in enumerate(df['avg_loss'].iloc[window_length+1:]):
    df['avg_loss'].iloc[i + window_length + 1] =\
        (df['avg_loss'].iloc[i + window_length] *
         (window_length - 1) +
         df['loss'].iloc[i + window_length + 1])\
        / window_length

Deusdeorum · Accepted Answer

I cannot reproduce your example with periods = 10. Feels like there's something fishy about your SMA. With periods = 14:

df = (df
    .assign(
        change = df.diff(),
        gain = lambda df_: df_.change.clip(0),
        loss = lambda df_: df_.change.clip(upper=0).abs(),
        avg_gain = lambda df_: df_.gain.rolling(window=14, min_periods=14).mean(),
        avg_loss = lambda df_: df_.loss.rolling(window=14, min_periods=14).mean(),
        rs = lambda df_: df_.avg_gain.div(df_.avg_loss),
        rsi = lambda df_: 100-100/(1+df_.rs)
    )
)

    close   change  gain    loss    avg_gain    avg_loss    rs  rsi
--- --- --- --- --- --- --- --- ---
--- --- --- --- --- --- --- --- ---
14  62.50   1.13    1.13    0.00    0.843   0.293   2.878   74.214
15  62.57   0.07    0.07    0.00    0.705   0.293   2.407   70.651
16  60.80   -1.77   0.00    1.77    0.630   0.419   1.503   60.041
17  59.37   -1.43   0.00    1.43    0.487   0.521   0.934   48.300
18  60.35   0.98    0.98    0.00    0.506   0.521   0.970   49.235
19  62.35   2.00    2.00    0.00    0.611   0.521   1.171   53.943
--- --- --- --- --- --- --- --- ---
--- --- --- --- --- --- --- --- ---

EDIT after updating question:

Turns out it's not only a simple SMA that's being calculated.

Suppose you have the frame with close vals:

df = pd.DataFrame({'close': {0: 54.8,1: 56.8,2: 57.85,3: 59.85,4: 60.57,5: 61.1,6: 62.17,7: 60.6,8: 62.35,9: 62.15,10: 62.35,11: 61.45,12: 62.8,13: 61.37,14: 62.5,15: 62.57,16: 60.8,17: 59.37,18: 60.35,19: 62.35,20: 62.17,21: 62.55,22: 64.55,23: 64.37,24: 65.3,25: 64.42,26: 62.9,27: 61.6,28: 62.05,29: 60.05,30: 59.7,31: 60.9,32: 60.25,33: 58.27,34: 58.7,35: 57.72,36: 58.1,37: 58.2}})

I created two helper functions to achieve a better flow, the first one is for the first average:

def get_first_avg(df_, col, periods):
    return (df_
        .assign(
            avg_ = df_[col].rolling(window=periods, min_periods=periods).mean(),
            avg = lambda ser_: np.where(ser_.index > periods, np.nan, ser_.avg_),
        )
        .rename(columns={'avg': f'avg_{col}'})
        .get([f'avg_{col}'])
    )

And the other helper function is for the rest of the averages (wilders?):

def get_avg(df_, col, periods):
    for i, row in enumerate(df_[f'avg_{col}'].iloc[periods+1:]):
        df_[f'avg_{col}'].iloc[i + periods + 1] = (
            (df_[f'avg_{col}'].iloc[i + periods] * (periods - 1) + 
            df_[f'{col}'].iloc[i + periods + 1]) / periods
        )

    return df_

So combining will give you what you want:

(df
    .assign(
        change = df.diff(),
        gain = lambda df_: df_.change.clip(0),
        loss = lambda df_: df_.change.clip(upper=0).abs(),
        avg_gain = lambda df_: df_.pipe(get_first_avg, col='gain', periods=14),
        avg_loss = lambda df_: df_.pipe(get_first_avg, col='loss', periods=14),
    )
    .pipe(get_avg, 'gain', 14)
    .pipe(get_avg, 'loss', 14)
    .assign(
        rs = lambda df_: df_.avg_gain.div(df_.avg_loss),
        rsi = lambda df_: 100-(100/(1+df_.rs))
    )
)

    close  change  gain  loss  avg_gain  avg_loss        rs        rsi
11  61.45   -0.90  0.00  0.90       NaN       NaN       NaN        NaN
12  62.80    1.35  1.35  0.00       NaN       NaN       NaN        NaN
13  61.37   -1.43  0.00  1.43       NaN       NaN       NaN        NaN
14  62.50    1.13  1.13  0.00  0.842857  0.292857  2.878049  74.213836
15  62.57    0.07  0.07  0.00  0.787653  0.271939  2.896435  74.335516
16  60.80   -1.77  0.00  1.77  0.731392  0.378943  1.930084  65.871286
17  59.37   -1.43  0.00  1.43  0.679150  0.454019  1.495863  59.933704
18  60.35    0.98  0.98  0.00  0.700639  0.421589  1.661902  62.432876
19  62.35    2.00  2.00  0.00  0.793451  0.391475  2.026822  66.962047

You should probably rewrite the f-strings there, but does the job.

Pandas.Series.ewm for calculating a RSI Indicator, with Wilders Moving Average

Answers (1)

EDIT after updating question:

Related Questions