Reputation: 1525
I'm running a binary classification in LightGBM using the training API and want to stop on a custom metric while still tracking one or more built-in metrics. It's not clear whether this is possible, though.
Here we can disable the default binary_logloss metric and only track our custom metric:
import lightgbm as lgb

def my_eval_metric(...):
    ...

d_train = lgb.Dataset(...)
d_validate = lgb.Dataset(...)

params = {
    "objective": "binary",
    "metric": "custom",
}

evals_result = {}
model = lgb.train(
    params,
    d_train,
    valid_sets=[d_validate],
    feval=my_eval_metric,
    early_stopping_rounds=10,
    evals_result=evals_result,
)
If instead we leave metric at its default, we will also track binary_logloss, but we will stop on both metrics instead of just on our custom metric:
params = {
    "objective": "binary",
    # "metric": "custom",
}
We can set first_metric_only in the params, but now we will stop only on binary_logloss because, apparently, it's the first metric:
params = {
    "objective": "binary",
    "first_metric_only": True,
}
Other things that probably work but seem like a pain:

- Reimplement binary_logloss myself, pass it as a custom evaluation metric in a list with my other custom metric, and use first_metric_only (sketched at the end of this question); however, it seems like I shouldn't have to do that.

Things that don't work:
- feval=[my_eval_metric, 'binary_logloss'] in the lgb.train call. Complains that a string is not callable.
- metric: [my_eval_metric, 'binary_logloss'] in the params. Warns Unknown parameter: my_eval_metric and then errors when training starts with ValueError: For early stopping, at least one dataset and eval metric is required for evaluation.

Am I missing something obvious, or is this a small hole in the LightGBM API?
This is on version 3.2.1. On version 3.0.0, it seems it's impossible to pass multiple custom evaluation metrics in the training API at all. I'm not sure about the sklearn API there.
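For reference, here is a minimal sketch of the reimplementation workaround mentioned above. It assumes feval accepts a list of callables (it does on 3.2.1) and that, with the built-in binary objective, the predictions passed to feval are already probabilities; binary_logloss_feval is my own hypothetical helper, not part of LightGBM:

import numpy as np

def binary_logloss_feval(preds, eval_data):
    # Hand-rolled binary log loss; assumes preds are probabilities here.
    y_true = eval_data.get_label()
    p = np.clip(preds, 1e-15, 1 - 1e-15)
    loss = -np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))
    return "binary_logloss", loss, False  # lower is better

model = lgb.train(
    {"objective": "binary", "metric": "custom", "first_metric_only": True},
    d_train,
    valid_sets=[d_validate],
    # my_eval_metric comes first, so early stopping should consider it alone
    feval=[my_eval_metric, binary_logloss_feval],
    early_stopping_rounds=10,
    evals_result=evals_result,
)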
Upvotes: 5
Views: 8302
Reputation: 2732
If you are asking "how do I perform early stopping based on a custom evaluation metric function?", that can be achieved by setting the parameter metric to the string "None". That will lead LightGBM to skip the default evaluation metric based on the objective function (binary_logloss, in your example) and only perform early stopping on the custom metric function you've provided in feval.
The example below, using lightgbm==3.2.1 and scikit-learn==0.24.1 on Python 3.8.8, reproduces this behavior.
import lightgbm as lgb
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.1, random_state=42)

dtrain = lgb.Dataset(
    data=X_train,
    label=y_train
)
dvalid = lgb.Dataset(
    data=X_test,
    label=y_test,
    reference=dtrain
)

def _constant_metric(dy_pred, dy_true):
    """An eval metric that always returns the same value"""
    metric_name = 'constant_metric'
    value = 0.708
    is_higher_better = False
    return metric_name, value, is_higher_better

evals_result = {}
model = lgb.train(
    params={
        "objective": "binary",
        "metric": "None",
        "num_iterations": 100,
        "first_metric_only": True,
        "verbose": 0,
        "num_leaves": 8
    },
    train_set=dtrain,
    valid_sets=[dvalid],
    feval=_constant_metric,
    early_stopping_rounds=5,
    evals_result=evals_result,
)
You can see in the logs that the custom metric function I've provided is evaluated against the validation set, and training stops after early_stopping_rounds consecutive rounds without improvement.
[LightGBM] [Warning] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000846 seconds.
You can set `force_col_wise=true` to remove the overhead.
[1] valid_0's constant_metric: 0.708
Training until validation scores don't improve for 5 rounds
[2] valid_0's constant_metric: 0.708
[3] valid_0's constant_metric: 0.708
[4] valid_0's constant_metric: 0.708
[5] valid_0's constant_metric: 0.708
[6] valid_0's constant_metric: 0.708
Early stopping, best iteration is:
[1] valid_0's constant_metric: 0.708
Evaluated only: constant_metric
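If it's helpful, the per-iteration history and the early-stopped iteration can then be read back from the objects populated above:

# evals_result is keyed by validation-set name, then metric name
print(evals_result["valid_0"]["constant_metric"][:3])  # [0.708, 0.708, 0.708]
print(model.best_iteration)  # 1, matching the log above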
If you are asking "how do I provide a mix of built-in metrics and custom evaluation functions to lgb.train() and get all metrics evaluated, but only use the custom one for early stopping?"...then yes, that is not supported as of lightgbm 3.2.1.
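If you only need the built-in metric for reporting rather than for early stopping, one workaround is to compute it separately after training; a small sketch using scikit-learn with the example above:

from sklearn.metrics import log_loss

# Predicted probabilities at the early-stopped best iteration
y_pred = model.predict(X_test, num_iteration=model.best_iteration)
print("binary_logloss on the validation data:", log_loss(y_test, y_pred))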
Upvotes: 14