Ishan Dutta

Reputation: 957

Python: LightGBM Hyperparameter Tuning Value Error

I wrote the following code to run RandomizedSearchCV on a LightGBM classifier, but it fails with the following error.

ValueError: For early stopping, at least one dataset and eval metric is required for evaluation

Code

import lightgbm as lgb
fit_params={"early_stopping_rounds":30, 
            "eval_metric" : 'f1', 
            "eval_set" : [(X_val,y_val)],
            'eval_names': ['valid'],
            'verbose': 100,
            # 'categorical_feature': 'auto'
            }

from scipy.stats import randint as sp_randint
from scipy.stats import uniform as sp_uniform
param_test ={'num_leaves': sp_randint(6, 50), 
             'min_child_samples': sp_randint(100, 500), 
             'min_child_weight': [1e-5, 1e-3, 1e-2, 1e-1, 1, 1e1, 1e2, 1e3, 1e4],
             'subsample': sp_uniform(loc=0.2, scale=0.8), 
             'colsample_bytree': sp_uniform(loc=0.4, scale=0.6),
             'reg_alpha': [0, 1e-1, 1, 2, 5, 7, 10, 50, 100],
             'reg_lambda': [0, 1e-1, 1, 5, 10, 20, 50, 100]}

n_HP_points_to_test = 100

from sklearn.model_selection import RandomizedSearchCV
# n_estimators is set to a "large value". The actual number of trees built depends on early stopping; 5000 is only the absolute maximum.
clf = lgb.LGBMClassifier(max_depth=-1, 
                         random_state=42, 
                         silent=True, 
                         metric='f1', 
                         n_jobs=4, 
                         n_estimators=5000,
                         )

gs = RandomizedSearchCV(
    estimator=clf, param_distributions=param_test, 
    n_iter=n_HP_points_to_test,
    scoring='f1',
    cv=3,
    refit=True,
    random_state=41,
    verbose=True)

gs.fit(X_trn, y_trn, **fit_params)
print('Best score reached: {} with params: {} '.format(gs.best_score_, gs.best_params_))

Tried Solutions
I have tried the solutions given in the following links, but none of them worked. How can I fix this?

  1. LightGBM error : ValueError: For early stopping, at least one dataset and eval metric is required for evaluation
  2. ValueError: For early stopping, at least one dataset and eval metric is required for evaluation #3028
  3. For early stopping, at least one dataset and eval metric is required for evaluation #1597

Upvotes: 1

Views: 1424

Answers (2)

pplonski

Reputation: 5839

F1 is not a built-in metric in LightGBM. You can add it as a custom eval_metric:

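import numpy as np
from sklearn.metrics import f1_score

def lightgbm_eval_metric_f1(preds, dtrain):
    # Custom metric for the native API: pass it as feval to lightgbm.train
    target = dtrain.get_label()
    weight = dtrain.get_weight()

    unique_targets = np.unique(target)
    if len(unique_targets) > 2:
        if preds.ndim == 1:
            # Flat, class-major predictions: reshape to (rows, classes)
            cols = len(unique_targets)
            rows = int(preds.shape[0] / len(unique_targets))
            preds = np.reshape(preds, (rows, cols), order="F")
        labels = np.argmax(preds, axis=1)
        average = "macro"  # assumption: pick the averaging that suits your problem
    else:
        labels = (preds > 0.5).astype(int)  # assumption: 0.5 decision threshold
        average = "binary"

    # Returns (metric name, value, is_higher_better)
    score = f1_score(target, labels, sample_weight=weight, average=average)
    return "f1", score, True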
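
The reshape with order="F" in the function above relies on multiclass predictions arriving as one flat array grouped by class first, then by row, which is how older LightGBM releases hand them to a custom metric (newer releases may already pass a 2-D array). A small NumPy-only sketch of that layout, with made-up numbers for 2 rows and 3 classes:

```python
import numpy as np

# Hypothetical flat predictions, class-major: all rows for class 0,
# then all rows for class 1, then all rows for class 2.
flat_preds = np.array([0.1, 0.6,   # class 0, rows 0..1
                       0.7, 0.3,   # class 1, rows 0..1
                       0.2, 0.1])  # class 2, rows 0..1
n_classes = 3
n_rows = flat_preds.shape[0] // n_classes

# Column-major ("F") reshape puts each class in its own column.
proba = np.reshape(flat_preds, (n_rows, n_classes), order="F")
labels = np.argmax(proba, axis=1)

print(proba)   # row 0 is [0.1, 0.7, 0.2], row 1 is [0.6, 0.3, 0.1]
print(labels)  # predicted class per row: [1 0]
```

The resulting class labels, rather than the raw probabilities, are what an F1 computation needs.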

Regarding optimization, I prefer to use the native Python API for LightGBM (lightgbm.train) with the Optuna framework, which works really well.

Optuna framework: https://github.com/optuna/optuna

But the easiest way to tune LightGBM with Optuna is to use MLJAR AutoML (it has the F1 metric built in).

from supervised.automl import AutoML  # from the mljar-supervised package

automl = AutoML(
    mode="Optuna",
    algorithms=["LightGBM"],
    optuna_time_budget=600,  # 10 minutes for tuning
    eval_metric="f1"
)
automl.fit(X, y)

MLJAR AutoML framework: https://github.com/mljar/mljar-supervised

If you want to check the details of the LightGBM + Optuna optimization in MLJAR, the code is here: https://github.com/mljar/mljar-supervised/blob/master/supervised/tuner/optuna/lightgbm.py

Upvotes: 1

Ben Reiniger

Reputation: 12582

The last message in your third link (Feb 2020) suggests this error gets raised if the metric is not recognized, and indeed "f1" is not one of LGBM's builtin metrics. Either use one of their builtins (but you can still use F1 as the hyperparameter search's selection criterion), or create a custom metric (see the note at the end of the LGBMClassifier.fit method's documentation).
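
A minimal sketch of the custom-metric route for the scikit-learn API, which accepts a callable of the form func(y_true, y_pred) returning (eval_name, eval_result, is_higher_better). It assumes a binary task where y_pred holds the probability of the positive class and that a 0.5 threshold is acceptable; the name f1_eval is hypothetical, and F1 is computed by hand with NumPy so the block has no other dependencies.

```python
import numpy as np

def f1_eval(y_true, y_pred):
    """Custom eval metric in the (y_true, y_pred) form for LGBMClassifier.fit."""
    y_true = np.asarray(y_true).astype(int)
    # Assumption: binary task, y_pred holds positive-class probabilities.
    y_hat = (np.asarray(y_pred) > 0.5).astype(int)
    tp = int(np.sum((y_hat == 1) & (y_true == 1)))
    fp = int(np.sum((y_hat == 1) & (y_true == 0)))
    fn = int(np.sum((y_hat == 0) & (y_true == 1)))
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0  # define F1 as 0 when there are no positives
    # (metric name, value, is_higher_better)
    return "f1", f1, True

# Quick check on toy data: 2 true positives, 0 false positives, 1 false negative
name, value, higher_better = f1_eval([1, 0, 1, 1], [0.9, 0.2, 0.4, 0.7])
print(name, round(value, 3), higher_better)
```

In the question's setup this would mean passing f1_eval as "eval_metric" in fit_params instead of the string 'f1'. The metric='f1' argument in the LGBMClassifier constructor would presumably need to be dropped or replaced by a built-in name as well. The scoring='f1' argument of RandomizedSearchCV is handled by scikit-learn and can stay.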

Upvotes: 0
