Reputation: 401
Referring to this post: "createTimeSlices function in CARET package in R", where createTimeSlices was suggested as an option for cross-validation with time series data. I would like to understand how to go about selecting values for 'initialWindow', 'horizon' and 'fixedWindow' in trainControl.
They are defined within caret as follows (?createTimeSlices):
initialWindow - The initial number of consecutive values in each training set sample
horizon - The number of consecutive values in test set sample
fixedWindow - A logical: if FALSE, the training set always start at the first sample.
Can someone please elaborate further on how to select the right values for initialWindow and horizon, and on the actual implications of selecting TRUE or FALSE for fixedWindow?
Upvotes: 1
Views: 3986
Reputation: 3688
initialWindow
: The size of the training set (window) for the first modeling iteration. How large it should be depends on the complexity of the model you are fitting, so you need to find out what minimum sample size is required for a reliable fit. More complex models need a larger window; see, for example, Measuring forecast accuracy, p. 6.
fixedWindow
: If TRUE, this implies a moving window (always equal to the size of initialWindow); if FALSE, it implies a growing window (in other words, it always starts at the first sample) that is used to fit the model. In the usual output of the models from caret you can observe the sizes of the training sample and whether it is growing or moving, as in (fixedWindow = FALSE, horizon = 1):
Resampling: Rolling Forecasting Origin Resampling (1 held-out with no fixed window)
Summary of sample sizes: 100, 101, 102, 103, 104, 105, ...
horizon
: This defines how many consecutive steps ahead the model is tested on. The output of the caret model gives a summary of the model accuracy when predicting that many steps ahead. The value to choose depends on your application, i.e. whether short-term or longer-term forecasts are desired. See again Measuring forecast accuracy, p. 7.
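To see how the three arguments interact, it can help to call createTimeSlices directly and inspect the indices it returns. The following is only a minimal sketch: the toy series of 10 observations and the values initialWindow = 5 and horizon = 2 are assumptions chosen for illustration, not recommendations.

```r
library(caret)

# Toy series of 10 observations (assumed for illustration); only its
# length matters for the resampling indices.
y <- 1:10

# Moving window: every training set has exactly initialWindow points.
moving <- createTimeSlices(y, initialWindow = 5, horizon = 2,
                           fixedWindow = TRUE)

# Growing window: every training set starts at the first observation.
growing <- createTimeSlices(y, initialWindow = 5, horizon = 2,
                            fixedWindow = FALSE)

# Training-set sizes per resample
print(unname(lengths(moving$train)))   # 5 5 5 5
print(unname(lengths(growing$train)))  # 5 6 7 8

# Each test set holds `horizon` consecutive points right after the
# corresponding training window.
print(moving$train[[4]])   # 4 5 6 7 8
print(moving$test[[4]])    # 9 10

# The same settings passed to train() through trainControl
ctrl <- trainControl(method = "timeslice", initialWindow = 5,
                     horizon = 2, fixedWindow = FALSE)
```

With a growing window the later resamples are fitted on more data than the earlier ones, while a moving window keeps the sample size constant and drops the oldest observations. Which of the two is preferable depends on whether older observations are still informative for your series.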
Upvotes: 2