Reputation: 63687
I am trying to train an XGBRegressor on time series data reshaped to have time steps, so the resulting shape of X_train can be something like (12345, 5, 10) if there are 12345 samples, 10 features and a time step of 5.
However, when we try to train XGBRegressor using such training data,
import xgboost as xgb
xgbr = xgb.XGBRegressor()
xgbr.fit(X_train, y_train)
we get the error
ValueError: ('Expecting 2 dimensional numpy.ndarray, got: ', (12345, 5, 10))
What is the correct way of training XGBRegressor on training data containing time steps?
Upvotes: 1
Views: 3039
Reputation: 88295
As is clear from the error, you cannot directly fit an XGBRegressor with a 3D array. Resorting to ML for time series problems is quite common, but you have to make sure you are feeding the regressor meaningful features that capture the time dependency between samples.
So something you could start with for your particular problem is to create a new feature indicating which time step a sample belongs to. That way, the decision trees will also learn the effect the time-dependent variable has on the target variable. The more meaningful features you can add the better, so you could also consider including features that capture other time-dependent statistics.
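As a rough sketch of what this could look like (not necessarily the only or best way), the snippet below shows two ways to get from a 3D array to the 2D input that XGBRegressor expects. The helper names `flatten_windows`, `add_window_stats` and `to_long_format` are made up for illustration, the data is synthetic, and the choice of mean and standard deviation as window statistics is only an assumption about what might be useful.

```python
import numpy as np

def flatten_windows(X):
    """(n_samples, n_steps, n_features) -> (n_samples, n_steps * n_features).
    Column t * n_features + f holds feature f at time step t."""
    n_samples, n_steps, n_features = X.shape
    return X.reshape(n_samples, n_steps * n_features)

def add_window_stats(X):
    """Flattened window plus per-feature mean and std over the time axis
    (hypothetical extra features, chosen only as an example)."""
    return np.hstack([flatten_windows(X), X.mean(axis=1), X.std(axis=1)])

def to_long_format(X, y):
    """One row per (sample, time step), with the time step index appended
    as an extra column. The target of each window is repeated for its steps."""
    n_samples, n_steps, n_features = X.shape
    step = np.tile(np.arange(n_steps), n_samples)
    X_long = np.column_stack([X.reshape(n_samples * n_steps, n_features), step])
    y_long = np.repeat(y, n_steps)
    return X_long, y_long

# Small synthetic stand-in for the (12345, 5, 10) array in the question.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 5, 10))
y_train = rng.normal(size=200)

X_2d = add_window_stats(X_train)                   # shape (200, 70)
X_long, y_long = to_long_format(X_train, y_train)  # shape (1000, 11)

try:
    import xgboost as xgb
    xgbr = xgb.XGBRegressor(n_estimators=20)
    xgbr.fit(X_2d, y_train)  # 2D input, so the ValueError no longer occurs
except ImportError:
    print("xgboost is not installed; skipping the fit")
```

The flattened layout keeps one row per original sample, so y_train can be used unchanged. The long format follows the time step feature idea more literally, but it repeats each target once per step, which may or may not suit your problem.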
Upvotes: 1