Reputation: 524
There are two different guidelines on using a customized loss function in xgboost.
If the predicted probability 'p' = sigmoid(z), where z is the raw score:
1. In one guideline, the gradient is taken w.r.t. 'z'.
2. In https://xgboost.readthedocs.io/en/latest/tutorials/custom_metric_obj.html, the gradient is taken w.r.t. 'p'.
Which is correct?
Upvotes: 1
Views: 2677
Reputation: 151
To keep this as general as possible: you need to calculate the gradient of the total loss function with respect to the current predicted values. Normally, your loss function will be of the form $L = \sum_{i=1}^{N} \ell(y_{i}, \hat{y}_{i})$, in which $y_{i}$ is the label of the $i^{\text{th}}$ datapoint and $\hat{y}_{i}$ is your prediction (in the binary classification case, you might choose to define it so that the $y_{i}$ are the binary labels and the $\hat{y}_{i}$ are the probabilities the classifier assigns to one of the classes).
You then need to calculate $\frac{\partial \ell}{\partial \hat{y}_{i}}\big|_{y_{i}}$ for each datapoint.
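As a worked example (assuming the standard binary log loss, which the question implies but does not spell out): with $p_{i} = \mathrm{sigmoid}(z_{i})$ and $\ell(y_{i}, p_{i}) = -\left[y_{i}\ln p_{i} + (1-y_{i})\ln(1-p_{i})\right]$, the chain rule connects the two conventions in the question:

$$\frac{\partial \ell}{\partial z_{i}} = \frac{\partial \ell}{\partial p_{i}} \cdot \frac{\partial p_{i}}{\partial z_{i}} = \frac{p_{i}-y_{i}}{p_{i}(1-p_{i})} \cdot p_{i}(1-p_{i}) = p_{i} - y_{i}, \qquad \frac{\partial^{2} \ell}{\partial z_{i}^{2}} = p_{i}(1-p_{i})$$

So a gradient written w.r.t. 'p' converts to one w.r.t. 'z' by multiplying by $\partial p/\partial z = p(1-p)$; the two guidelines differ only in which variable they differentiate against.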
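A minimal sketch of this as an XGBoost custom objective, assuming the Python API (the helper name `logistic_obj` and the random data are illustrative only). Note that XGBoost passes the objective the raw scores 'z', so the returned gradient and Hessian here are w.r.t. 'z':

```python
import numpy as np
import xgboost as xgb

def logistic_obj(preds, dtrain):
    """Binary log-loss objective: preds are raw scores z, not probabilities."""
    y = dtrain.get_label()
    p = 1.0 / (1.0 + np.exp(-preds))  # p = sigmoid(z)
    grad = p - y                      # first derivative of the loss w.r.t. z
    hess = p * (1.0 - p)              # second derivative of the loss w.r.t. z
    return grad, hess

# Illustrative usage on random data, just to show the API shape.
X = np.random.rand(100, 5)
y = (np.random.rand(100) > 0.5).astype(float)
dtrain = xgb.DMatrix(X, label=y)
booster = xgb.train({"max_depth": 2}, dtrain, num_boost_round=10, obj=logistic_obj)
# With a custom objective, booster.predict returns raw scores z;
# apply the sigmoid yourself to get probabilities.
```

If you prefer to differentiate w.r.t. 'p' instead, multiply your gradient by $p(1-p)$ (and adjust the Hessian accordingly) before returning it.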
Upvotes: 1