Nyxynyx

Reputation: 63687

Can XGBoost Use 3-Dimensional Input Containing Time Steps in Python?

I am trying to train an XGBRegressor on time series data reshaped to include time steps, so X_train has a shape such as (12345, 5, 10) for 12345 samples, 5 time steps and 10 features.

However, when I try to train XGBRegressor on such training data,

import xgboost as xgb

# Use a distinct variable name so the xgboost module is not shadowed
xgbr = xgb.XGBRegressor()
xgbr.fit(X_train, y_train)

I get the error

ValueError: ('Expecting 2 dimensional numpy.ndarray, got: ', (12345, 5, 10))

What is the correct way of training XGBRegressor on training data containing time steps?

Upvotes: 1

Views: 3039

Answers (1)

yatu

Reputation: 88295

As the error makes clear, you cannot directly fit an XGBRegressor on a 3D array; it expects a 2D array of shape (n_samples, n_features). Using ML models for time series problems is quite common, but you have to make sure you are feeding the regressor meaningful features that capture the time dependency between samples.
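One simple way to get a 2D array that still carries the time dependency is to flatten each window, so that every (time step, feature) pair becomes its own column. This is only a sketch of one possible approach, not something taken from the question: `flatten_windows` is a hypothetical helper, and the random data stands in for the real X_train.

```python
import numpy as np


def flatten_windows(X):
    """Reshape (n_samples, n_steps, n_features) to (n_samples, n_steps * n_features).

    Hypothetical helper: each time step's features become separate columns,
    ordered step by step (step 0 features first, then step 1, and so on).
    """
    X = np.asarray(X)
    if X.ndim != 3:
        raise ValueError(f"expected a 3D array, got shape {X.shape}")
    n_samples, n_steps, n_features = X.shape
    return X.reshape(n_samples, n_steps * n_features)


# Placeholder data (assumption): 100 samples, 5 time steps, 10 features
rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 5, 10))

X_flat = flatten_windows(X_train)
print(X_flat.shape)  # (100, 50)
```

The resulting X_flat is 2D, so it can be passed to xgbr.fit(X_flat, y_train) in place of the 3D array. The trees then see each lagged value as an independent column, which may or may not be enough for a given problem.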

So something you could start with for your particular problem is to create a new feature indicating which time step a sample belongs to. That way, the decision trees can also learn the effect the time-dependent variable has on the target variable. The more meaningful features you can add the better, so you could also consider including features that capture other time-dependent statistics.
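A rough sketch of both ideas, under some assumptions: y_train holds one target per window (shape (n_samples,)), so the target is repeated for every time step of that window, and the per-window statistics chosen here (mean, standard deviation, last minus first value) are just examples. `add_time_step_feature` and `window_statistics` are hypothetical helper names, and the random arrays are placeholders for the real data.

```python
import numpy as np


def add_time_step_feature(X, y):
    """Turn each (sample, time step) pair into one row with a step-index column.

    Assumes X has shape (n_samples, n_steps, n_features) and y has shape
    (n_samples,). Returns X_2d of shape (n_samples * n_steps, n_features + 1)
    and y_2d of shape (n_samples * n_steps,).
    """
    X = np.asarray(X)
    y = np.asarray(y)
    n_samples, n_steps, n_features = X.shape
    rows = X.reshape(n_samples * n_steps, n_features)
    # Step index 0..n_steps-1, repeated for every sample
    step = np.tile(np.arange(n_steps), n_samples).reshape(-1, 1)
    X_2d = np.hstack([rows, step])
    # Each window's target is repeated once per time step
    y_2d = np.repeat(y, n_steps)
    return X_2d, y_2d


def window_statistics(X):
    """Per-window summaries: mean, std and (last - first) of every feature."""
    X = np.asarray(X)
    return np.concatenate(
        [X.mean(axis=1), X.std(axis=1), X[:, -1, :] - X[:, 0, :]], axis=1
    )


# Placeholder data (assumption): 100 samples, 5 time steps, 10 features
rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 5, 10))
y_train = rng.normal(size=100)

X_2d, y_2d = add_time_step_feature(X_train, y_train)
X_stats = window_statistics(X_train)
print(X_2d.shape, y_2d.shape, X_stats.shape)
```

Either X_2d (with y_2d) or X_stats (with the original y_train) is 2D and can be passed to XGBRegressor.fit. Which of the two is more useful depends on whether the target is defined per time step or per window.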

Upvotes: 1
