Reputation: 115
I am working with StackingRegressor from sklearn, and I used lightgbm to train my model. My LightGBM model has an early stopping option, and I passed an eval dataset and eval metric for it. But when it feeds into the StackingRegressor, I get this error:
ValueError: For early stopping, at least one dataset and eval metric is required for evaluation
which is frustrating, because I do have them in my code. I wonder what is happening? Here's my code.
import numpy as np
import pandas as pd
import lightgbm as lgb
from sklearn.ensemble import AdaBoostRegressor
from sklearn.tree import DecisionTreeRegressor
import xgboost as xgb
from sklearn.ensemble import StackingRegressor
opt_parameters_LGBM = {'bagging_fraction': 0.37031434827212084, 'bagging_seed': 47, 'boosting_type': 'gbdt',
'feature_fraction': 0.3894822966866982, 'learning_rate': 0.01, 'max_bin': 177, 'max_depth': -1,
'metric': 'rmse', 'min_child_weight': 1000.0, 'num_leaves': 161, 'objective': 'regression',
'random_state': 47, 'reg_alpha': 10, 'reg_lambda': 50, 'verbosity': -1}
m1 = lgb.LGBMRegressor(valid_sets=[lgb_train, lgb_eval], verbose_eval=30, num_boost_round=10000,
                       early_stopping_rounds=10, n_jobs=4, n_estimators=3000, **opt_parameters_LGBM)
m1.fit(X_train_df, y_train_df, eval_set = (X_val_df, y_val_df), eval_metric = 'rmse')
opt_parameters_ADA = {'learning_rate': 0.03, 'n_estimators': 5}
m2 = AdaBoostRegressor(base_estimator=DecisionTreeRegressor(max_depth=3, min_samples_leaf=1, min_impurity_decrease=10, random_state=47), random_state=47, **opt_parameters_ADA)
m2.fit(X_train_df, y_train_df)
# Where the problem starts
gbm = xgb.XGBRegressor(
    learning_rate=0.02,
    n_estimators=5,
    max_depth=4,
    min_child_weight=2,
    gamma=0.9,
    subsample=0.8,
    colsample_bytree=0.8,
    objective='reg:squaredlogerror',
    nthread=-1,
    verbosity=3,
    random_state=20)
estimators = [('lgbm', m1), ('ada', m2)]
gbm = StackingRegressor(estimators=estimators, final_estimator=gbm, cv=5, verbose=1)
gbm.fit(X_train_df, y_train_df)
Upvotes: 0
Views: 586
Reputation: 785
I guess the issue is caused by the fact that early_stopping_rounds was set on the LGBMRegressor: StackingRegressor internally re-fits clones of the base estimators with a plain fit(X, y) call (no eval_set), so LightGBM has no evaluation data for early stopping and raises that error.
Right after the line where you fitted your LGBMRegressor model — m1.fit(X_train_df, y_train_df, eval_set = (X_val_df, y_val_df), eval_metric = 'rmse') — add these lines:
params = m1.get_params()
# remove early_stopping_rounds, since this model has already been fitted on the eval data
params["early_stopping_rounds"] = None
m1.set_params(**params)
See if the error goes away.
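For completeness, here is a minimal sketch of how the rest of your script would then look. It only reuses the m1, m2, gbm, estimators and training frames from your post; stack is just a renamed variable so the name gbm isn't reused twice, and the set_params one-liner is a shorter equivalent of the snippet above (if your LightGBM version rejects it, stick with the get_params/set_params version).
# drop early stopping from the already-fitted LightGBM model before stacking
m1.set_params(early_stopping_rounds=None)

estimators = [('lgbm', m1), ('ada', m2)]

# StackingRegressor re-fits clones of m1 and m2 via cross-validation,
# so none of them may require an eval_set at fit time
stack = StackingRegressor(estimators=estimators, final_estimator=gbm, cv=5, verbose=1)
stack.fit(X_train_df, y_train_df)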
Upvotes: 2