Reputation: 55
We have hourly time series data with two columns: one is the timestamp and the other is the error rate. We used an H2O deep-learning model to learn and predict the future error rate, but it looks like it requires at least two features (besides the timestamp) to create the model.
Is there any way H2O can learn from this type of data (time, value), which has only one feature, and predict the value for a given future time?
Upvotes: 1
Views: 2217
Reputation: 447
I have tried to use many of the default methods inside H2O with time series data. If you treat the system as a state machine where the state variables are a series of lagged prior states, it's possible, but not entirely effective, as the prior states don't maintain their causal order. One way to alleviate this is to assign weights to each lagged state set based on how far back in time it lies, similar to how an EMA gives precedence to more recent data.
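To make the weighting idea concrete, here is a minimal sketch in plain Python. The function names `ema_lag_weights` and `weight_lags` and the smoothing parameter `alpha` are my own illustrative choices, not part of H2O; the weights simply decay geometrically with the lag, the way an EMA's do.

```python
def ema_lag_weights(n_lags, alpha=0.5):
    """Return weights for lags 1..n_lags that decay geometrically, like an EMA."""
    raw = [alpha * (1 - alpha) ** (k - 1) for k in range(1, n_lags + 1)]
    total = sum(raw)
    return [w / total for w in raw]  # normalise so the weights sum to 1


def weight_lags(lagged_values, weights):
    """Scale each lagged value (most recent first) by its weight."""
    return [v * w for v, w in zip(lagged_values, weights)]


# Hypothetical example: the three most recent observations, newest first.
weights = ema_lag_weights(n_lags=3, alpha=0.5)
weighted = weight_lags([27.0, 14.0, 10.0], weights)
```

Whether scaling the inputs this way actually helps a given model is something to check empirically; it is only one possible reading of "weights based on time".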
If you are looking to see how easy or effective DL/ML can be for a nonlinear time series model, I would start with an easy problem to validate that the DL approach gives any improvement over a simple one-period ARIMA/GARCH-type process.
I have used this technique, with varying success. What I have had success with is taking well-known nonlinear time series models and improving their predictive qualities with additional factors, using the handcrafted nonlinear model as an input into the DL method. It seems that certain qualities of the entire parameter space that I haven't manually worked out are able to supplement a decent foundation.
The real question at that point is that you have introduced immense complexity that isn't entirely understood. Is that complexity warranted in the compiled landscape when the nonlinear model encapsulates about 95% of the information between the two stages?
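As a rough illustration of such a baseline, the sketch below fits a one-lag autoregression, x[t] = c + phi * x[t-1], by ordinary least squares and scores it on a hold-out set. This is a deliberately simplified stand-in for a full ARIMA/GARCH fit, the data are made up, and the names `fit_ar1` and `mean_abs_error` are hypothetical. Any DL model would then need to beat `baseline_mae` on the same hold-out data to justify itself.

```python
def fit_ar1(series):
    """Least-squares fit of x[t] = c + phi * x[t-1]; returns (c, phi).

    Assumes the series is not constant (otherwise the variance below is zero).
    """
    x_prev = series[:-1]
    x_next = series[1:]
    n = len(x_prev)
    mean_prev = sum(x_prev) / n
    mean_next = sum(x_next) / n
    cov = sum((a - mean_prev) * (b - mean_next) for a, b in zip(x_prev, x_next))
    var = sum((a - mean_prev) ** 2 for a in x_prev)
    phi = cov / var
    c = mean_next - phi * mean_prev
    return c, phi


def mean_abs_error(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Made-up hourly error rates, split into a training part and a hold-out part.
train = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
holdout = [7.0, 8.0, 9.0]

c, phi = fit_ar1(train)
# One-step-ahead forecasts: each prediction uses the value observed one hour earlier.
previous_values = [train[-1]] + holdout[:-1]
baseline_pred = [c + phi * x for x in previous_values]
baseline_mae = mean_abs_error(holdout, baseline_pred)
```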
Upvotes: 1
Reputation: 648
Interesting question,
I have read about declaring additional variables that represent previous values of the time series, similar to the way regression on past values is done in ARIMA models. But I'm not sure whether this is a valid way to do it, so please correct me if I am wrong.
Consequently you could try to extend your dataset to something like this:
t   value(t)   value(t-1)   value(t-2)   value(t-3)   ...
1   10         NA           NA           NA           ...
2   14         10           NA           NA           ...
3   27         14           10           NA           ...
...
After this, value(t) is your response (output neuron) and the others are your predictor variables, each referring to an input neuron.
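A minimal sketch of building such a table in plain Python, in case it helps. The helper name `make_lagged_rows` is hypothetical, and `None` stands in for NA; the resulting rows would still have to be loaded into whatever frame type your modelling library expects.

```python
def make_lagged_rows(values, n_lags):
    """Build rows of [t, value(t), value(t-1), ..., value(t-n_lags)].

    Lags that reach before the start of the series are filled with None (NA).
    """
    rows = []
    for t, value in enumerate(values):
        lags = [values[t - k] if t - k >= 0 else None for k in range(1, n_lags + 1)]
        rows.append([t + 1, value] + lags)  # t is 1-based, as in the table above
    return rows


rows = make_lagged_rows([10, 14, 27], n_lags=3)
for row in rows:
    print(row)

# Rows that still contain None can be dropped (or imputed) before training.
complete_rows = [r for r in rows if None not in r]
```

With only three observations and three lags, every row still has a missing lag, so in practice you need a series considerably longer than the number of lags.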
Upvotes: 2
Reputation: 8819
Not in the current release of H2O, but ARIMA models are in development. You can follow the progress here.
Upvotes: 2