Reputation: 2250

Target variable for time series analysis

I am working on a time series prediction for the first time and am a little confused about how to create the target variable. The data looks like:

I am trying to predict the percentage change in sales for the first quarter of 2019 for customer A. One way I thought of deriving the target is to take a rolling average of the past 3 months and shift it by 1. After manipulation, it looks like:

But I am confused: should I take the average of Jan, Feb, and March as the target for April, or the average of Feb, March, and April as the target for Jan?

Upvotes: 0

Answers (1)

razimbres

Reputation: 5015

Time series prediction is based on the principle of autocorrelation: the correlation between a stretch of the series, say from Xn to Xn+100, and the same series shifted by a time lag, from Xn+time_lag to Xn+100+time_lag.

You will notice that the bigger the time lag, the smaller the autocorrelation and the worse the predictive power of your model.

If you create a rolling mean, you will lose information and end up with a fuzzy target. I would use the target itself for better predictions.

What I mean is that you use the same variable, the target, for both x_train and y_train, with a time lag between them.
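As a minimal sketch of that idea, assuming monthly sales held in a pandas Series: the numbers below are made up, and `time_lag = 1` is only an illustrative choice, not something taken from the question's data. The target is the month-over-month percentage change, and the input is the same series shifted by the lag.

```python
import pandas as pd

# Hypothetical monthly sales for one customer (made-up numbers for illustration).
sales = pd.Series(
    [100.0, 110.0, 121.0, 115.0, 130.0, 128.0],
    index=pd.period_range("2018-07", periods=6, freq="M"),
    name="sales",
)

# Month-over-month percentage change: the quantity being predicted.
pct = sales.pct_change()

# The input is the same variable shifted back in time; time_lag is an assumed choice.
time_lag = 1
frame = pd.DataFrame({"x": pct.shift(time_lag), "y": pct}).dropna()

x_train = frame[["x"]]  # value observed time_lag months earlier
y_train = frame["y"]    # value to predict
print(frame)
```

Each row pairs a month's value (y) with the value seen time_lag months before it (x), so the model only ever sees information that was available before the month it predicts.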

Then you can use ARIMA, LSTM neural networks, linear regression, other neural networks, or temporal convolutional networks to map from input to target.

Check the level of autocorrelation of your data:

from pandas.plotting import autocorrelation_plot
import matplotlib.pyplot as plt
autocorrelation_plot(dataframe['target'])  # dataframe: your DataFrame holding the target column
plt.show()
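If you prefer numbers to a plot, the autocorrelation at specific lags can also be read off with `Series.autocorr`. The sketch below runs on a synthetic series (an assumed AR(1) process with coefficient 0.8, generated only for illustration), so swap in your own target column.

```python
import numpy as np
import pandas as pd

# Synthetic AR(1) series, used only as a stand-in for real data.
rng = np.random.default_rng(0)
n = 500
noise = rng.normal(size=n)
values = np.zeros(n)
for t in range(1, n):
    values[t] = 0.8 * values[t - 1] + noise[t]
series = pd.Series(values)

# Pearson correlation between the series and itself shifted by each lag.
acf = {lag: series.autocorr(lag=lag) for lag in (1, 3, 6, 12)}
for lag, value in acf.items():
    print(f"lag {lag:>2}: {value:.3f}")
```

For a series like this one, the value at lag 1 is high and falls off as the lag grows, which is the pattern to look for when deciding how far back your inputs can usefully reach.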

Upvotes: 1
