k_bm

Reputation: 91

How to reverse a seasonal log difference of a time series in Python

Could you please help me with this issue? I have searched extensively but cannot solve it. I have a multivariate dataframe of electricity consumption and I am forecasting it with a VAR (vector autoregression) model. I have made the predictions, but to get the real energy values I need to reverse the transformation behind energy_log_diff, because I applied a seasonal log difference to make the series stationary:

df['energy_log'] = np.log(df['energy'])
df['energy_log_diff'] = df['energy_log'] - df['energy_log'].shift(1)

To reverse it, I first tried:

df['energy'] = np.exp(df['energy_log_diff']) 

This is supposed to give the energy difference between two values lagged by 365 days, but I am not sure about this either.

How can I do this?

Upvotes: 5

Views: 8010

Answers (2)

tjaqu787

Reputation: 327

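The reason we use log differences is that they are additive, so we can take their cumulative sum, exponentiate it, and multiply by the last observed value.

last_energy = df['energy'].iloc[-1]

# cumulative sum of the log differences (not of the energy values), then exp;
# the differenced values are assumed to follow last_energy in time, e.g. forecasts
df['energy'] = np.exp(df['energy_log_diff'].cumsum()) * last_energy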

As for seasonality: if you de-seasonalized the log differences, add (or multiply) the seasonal component back before the step above; if you de-seasonalized the original series, add it back afterwards.
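
As a minimal sketch of the cumulative-sum approach applied to forecasts: the numbers and the name forecast_log_diff below are made up for illustration, and a lag-1 log difference is assumed.

```python
import numpy as np
import pandas as pd

# Hypothetical observed history and forecast log differences (made-up numbers).
energy = pd.Series([100.0, 110.0, 121.0])
forecast_log_diff = pd.Series([0.05, -0.02, 0.01])

# The last actual value is the base that the forecast differences build on.
last_energy = energy.iloc[-1]

# The running sum of log differences equals log(y_t / y_last),
# so exp() turns it into a ratio relative to the last observed value.
forecast_energy = last_energy * np.exp(forecast_log_diff.cumsum())
print(forecast_energy)
```

The result is on the original energy scale, with one value per forecast step.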

Upvotes: 4

Nerxis

Reputation: 3927

Short answer: you have to apply the inverse transformations in reverse order, which in your case means:

  1. Inverse transform of differencing
  2. Inverse transform of log

How to convert differenced forecasts back is described e.g. here (the question is tagged R, but it contains no code and the idea is the same in Python). In your post you calculate the exponential, but you have to reverse the differencing first.

You could try this:

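v_0 = df['energy_log'].iloc[0]  # first log value, before differencing
energy_log_diff_rev = [v_0]
v_prev = v_0
for v in df['energy_log_diff'].iloc[1:]:  # skip the leading NaN created by shift(1)
    v_prev += v
    energy_log_diff_rev.append(v_prev)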

Or, if you prefer the pandas way, you can try this (first-order differences only):

energy_log_diff_rev = df['energy_log_diff'].expanding(min_periods=0).sum() + v_0

Note the v_0 value: it is the first original value after the log transformation and before differencing, as described in the link above.

Then, after this step, you can do the exponential (inverse of log):

energy_orig = np.exp(energy_log_diff_rev)
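
To check that the two inverse steps really recover the original values, here is a small round-trip sketch on made-up data. It assumes a lag-1 difference as in the question, and uses fillna(0).cumsum() as an equivalent of the expanding sum above.

```python
import numpy as np
import pandas as pd

# Made-up energy values, for illustration only.
df = pd.DataFrame({"energy": [100.0, 120.0, 90.0, 150.0, 130.0]})
df["energy_log"] = np.log(df["energy"])
df["energy_log_diff"] = df["energy_log"].diff()  # same as x - x.shift(1)

# v_0: first log value, i.e. the value before any differencing.
v_0 = df["energy_log"].iloc[0]

# 1. Undo the differencing: the first diff is NaN, so treat it as 0
#    and let the running sum start from v_0.
energy_log_rev = df["energy_log_diff"].fillna(0).cumsum() + v_0

# 2. Undo the log.
energy_rev = np.exp(energy_log_rev)

roundtrip_ok = np.allclose(energy_rev, df["energy"])
print(roundtrip_ok)
```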

Notes/Questions:

  • You mention values lagged by 365, but you are shifting the data by 1. Does that mean you have yearly data? Or did you mean df['energy_log_diff'] = df['energy_log'] - df['energy_log'].shift(365) instead (in the case of daily data)?
  • You want to get the reversed time series from the predictions, is that right? Or am I missing something? In that case, you would apply the inverse transformations to the predictions, not to the data I used above for illustration.
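
If the difference really is seasonal (for example shift(365) on daily data), a cumulative sum is no longer enough: each value has to be added to the log value one season earlier. The following is only a sketch under that assumption. The helper name invert_seasonal_log_diff and the toy numbers are hypothetical, and a short lag of 2 is used so the example stays readable.

```python
import numpy as np

def invert_seasonal_log_diff(history, diff_forecast, lag):
    """Turn forecast seasonal log differences back into original-scale values.

    history       : observed values on the original scale (at least `lag` of them)
    diff_forecast : predicted values of log(y_t) - log(y_{t-lag})
    lag           : seasonal lag used when differencing (e.g. 365 for daily data)
    """
    log_values = list(np.log(np.asarray(history, dtype=float)))
    n_hist = len(log_values)
    if n_hist < lag:
        raise ValueError("need at least `lag` observed values")
    for d in diff_forecast:
        # log(y_t) = diff_t + log(y_{t-lag}); forecasts beyond one season
        # build on earlier reconstructed values.
        log_values.append(log_values[-lag] + d)
    return np.exp(np.array(log_values[n_hist:]))

# Toy check with lag 2: the true continuation of the history is 50, 60, 70.
history = [10.0, 20.0, 30.0, 40.0]
diffs = [np.log(50 / 30), np.log(60 / 40), np.log(70 / 50)]
restored = invert_seasonal_log_diff(history, diffs, lag=2)
print(restored)
```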

Upvotes: 2
