Reputation: 316
I use the mlp and elm functions from the nnfor package to forecast non-stationary time series. The two functions choose different numbers of nodes for the input and hidden layers. I am interested in how each function selects the number of nodes in each layer, and it would be great to understand how the generalization error changes with the way these functions work.
Upvotes: 1
Views: 406
Reputation: 1833
The number of hidden nodes chosen by the mlp function depends on the value of the hd.auto.type parameter:
The number of hidden nodes tried for the "valid", "cv" and "elm" parameter values ranges from 1 to max(2, min(dim(X)[2] + 2, length(Y) - 2)). These hidden nodes are limited to a single layer.
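To make that bound concrete, here is a minimal R sketch (the series and the construction of X are invented for the example; inside nnfor the input matrix is built internally from the selected lags and deterministic terms):

    Y <- as.numeric(AirPassengers)   # 144 observations, example data
    X <- embed(Y, 13)[, -1]          # hypothetical input matrix with 12 lag columns
    hd.max <- max(2, min(dim(X)[2] + 2, length(Y) - 2))
    hd.tried <- 1:hd.max             # hidden-node counts that would be tried
    hd.max                           # min(12 + 2, 144 - 2) = 14 here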
The "cv" and "valid" approaches use the minimum of the mean square error to find the number of hidden nodes.
As far as I can tell from the auto.hd.elm function in the source code, the "elm" approach uses the median of the number of significant model coefficients to choose the number of hidden nodes. Hope that makes sense to you!
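As a usage sketch, the selection strategy is picked when calling mlp (the hd.auto.type values are the ones discussed above; the series is arbitrary):

    library(nnfor)
    fit.valid <- mlp(AirPassengers, hd.auto.type = "valid")  # held-out validation MSE
    fit.cv    <- mlp(AirPassengers, hd.auto.type = "cv")     # cross-validated MSE
    fit.elm   <- mlp(AirPassengers, hd.auto.type = "elm")    # significant-coefficient heuristic
    print(fit.cv)   # the printout reports the number of hidden nodes chosen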
The elm function uses min(100 - 60*(type=="step" | type=="lm"), max(4, length(Y) - 2 - as.numeric(direct)*length(X[1,]))) to determine the number of hidden nodes, where type is the estimation method used for the output-layer weights and direct indicates the presence of direct input-output connections.
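Plugging in some numbers makes the formula easier to read; a sketch with invented dimensions (variable names mirror the quoted code):

    Y <- as.numeric(AirPassengers)   # 144 observations
    X <- embed(Y, 13)[, -1]          # 12 input columns
    type   <- "lm"                   # output-layer weights estimated by least squares
    direct <- FALSE                  # no direct input-output connections
    hd <- min(100 - 60 * (type == "step" | type == "lm"),
              max(4, length(Y) - 2 - as.numeric(direct) * length(X[1, ])))
    hd                               # 40: the "lm"/"step" cap of 100 - 60 binds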
The number of input nodes depends on seasonality and lags.
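You can inspect (and override) those inputs; a usage sketch with the documented lags and sel.lag arguments:

    library(nnfor)
    fit <- mlp(AirPassengers, lags = 1:12, sel.lag = FALSE)  # force 12 autoregressive lags
    print(fit)   # reports the lags and deterministic seasonal terms used as inputs
    plot(fit)    # network diagram showing the input and hidden nodes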
Generalization error can be approximated using cross-validation. To be clear, this cross-validation would have to be done separately from any validation used to find the number of hidden nodes.
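As an illustration, here is a minimal rolling-origin sketch (my own construction, not part of nnfor); the hidden-node selection happens inside each training window, so the held-out errors approximate the generalization error:

    library(nnfor)
    library(forecast)                            # for the forecast() generic
    y <- AirPassengers
    h <- 12                                      # forecast horizon
    origins <- seq(100, length(y) - h, by = 12)  # rolling training-window ends
    err <- sapply(origins, function(n) {
      train <- ts(y[1:n], frequency = frequency(y), start = start(y))
      fit <- mlp(train, hd.auto.type = "valid", reps = 3)  # node selection inside the fold
      fc  <- forecast(fit, h = h)
      mean((y[(n + 1):(n + h)] - fc$mean)^2)     # out-of-sample MSE for this fold
    })
    mean(err)                                    # cross-validated generalization error estimate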
The nnfor package author has an introductory blog post which may be worth checking: http://kourentzes.com/forecasting/2017/02/10/forecasting-time-series-with-neural-networks-in-r/
Upvotes: 2