Nik Berry

Reputation: 11

Tweedie deviance loss function in lightGBM

I am using the Tweedie objective function with lightGBM and have some questions:

I looked in the source code and it seems that the loss is just two terms from the deviance.

Has anybody had experience with hyperparameter tuning using tweedie loss?

Upvotes: 1

Views: 2288

Answers (1)

Ramon Dalmau

Reputation: 381

It is never too late to answer.

  • What is the loss function that lightGBM uses for Tweedie?

You can check the tweedie loss function implementation in the source code of lightGBM: https://github.com/microsoft/LightGBM/blob/1c27a15e42f0076492fcc966b9dbcf9da6042823/src/metric/regression_metric.hpp#L300-L318

  • How does it deal with predictions that are 0 in value as the mean_tweedie_deviance in sklearn asserts strictly positive truth and predictions?

As you can see in the previous link, the raw score is clipped to 1e-10f in order to ensure strictly positive predictions. The truth can be 0; that is not a problem for the Tweedie loss function.
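A minimal numpy sketch (not LightGBM's actual C++ code) of why the clip is needed: for a Tweedie power 1 < ρ < 2, the prediction enters the loss as a negative power, so a zero prediction would diverge, whereas a zero truth only multiplies a finite term. The loss formula below is assumed from the `regression_metric.hpp` link above.

```python
import numpy as np

rho = 1.5  # Tweedie variance power, assumed to be in (1, 2)

def lgbm_tweedie_loss(y, y_pred):
    # Loss as in LightGBM's metric (assumed from regression_metric.hpp):
    # predictions clipped to 1e-10 so y_pred**(1 - rho) stays finite.
    y_pred = np.maximum(y_pred, 1e-10)
    return -y * y_pred**(1 - rho) / (1 - rho) + y_pred**(2 - rho) / (2 - rho)

# Zero *truth* is harmless: y only multiplies the first term.
loss_zero_truth = lgbm_tweedie_loss(0.0, 2.0)

# Zero *prediction* is rescued by the clip: large but finite.
loss_zero_pred = lgbm_tweedie_loss(1.0, 0.0)

print(loss_zero_truth, loss_zero_pred)
```

Without the `np.maximum` clip, the second call would evaluate `0.0**(-0.5)` and produce an infinite loss.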

  • Is mean_tweedie_deviance the loss?

Not exactly. The Tweedie loss function (since it is just that: a loss to minimise) ignores the constant term of the mean_tweedie_deviance. If you look carefully at the definition of mean_tweedie_deviance (see the official documentation at https://scikit-learn.org/stable/modules/model_evaluation.html#mean-tweedie-deviance ), you will notice that for p > 0 (the typical case when using the Tweedie loss; otherwise you are just assuming a normal distribution of your target) there is a term that depends only on the truth $y$ and not on the prediction $\hat{y}$. It is pointless to attempt to minimise a constant, which is presumably why the LightGBM developers dropped this term. It should be noted, however, that the derivative of the Tweedie loss function as implemented in LightGBM matches the derivative of the mean_tweedie_deviance with respect to $\hat{y}$, and this is what really matters :).
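To make the relationship concrete, here is a small numpy sketch (formulas assumed from the sklearn docs for 1 < p < 2 and from the LightGBM metric linked above): the full mean deviance minus twice LightGBM's loss is the same number for any prediction, i.e. the dropped term depends only on the truth.

```python
import numpy as np

def lgbm_tweedie_loss(y, y_pred, rho=1.5):
    # LightGBM-style loss (assumed): two prediction-dependent terms of the
    # deviance, with predictions clipped to stay strictly positive.
    y_pred = np.maximum(y_pred, 1e-10)
    return np.mean(-y * y_pred**(1 - rho) / (1 - rho)
                   + y_pred**(2 - rho) / (2 - rho))

def mean_tweedie_deviance(y, y_pred, p=1.5):
    # Full mean Tweedie deviance for 1 < p < 2, as in the sklearn docs,
    # including the term that depends only on y.
    return np.mean(2 * (y**(2 - p) / ((1 - p) * (2 - p))
                        - y * y_pred**(1 - p) / (1 - p)
                        + y_pred**(2 - p) / (2 - p)))

y = np.array([0.0, 1.0, 2.5, 4.0])
for y_pred in (np.full(4, 1.0), np.full(4, 2.0)):
    diff = mean_tweedie_deviance(y, y_pred) - 2 * lgbm_tweedie_loss(y, y_pred)
    print(diff)  # same value for both predictions: the dropped term is constant
```

Because the two objectives differ only by a constant (and a factor of 2), their gradients with respect to the prediction are proportional, so minimising one minimises the other.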

Upvotes: 1
