Reputation: 9165
I use the Python implementation of XGBoost. One of the objectives is rank:pairwise, which minimizes the pairwise loss (Documentation). However, the documentation does not say anything about the range of the output. I see numbers between -10 and 10, but can it in principle be anywhere from -inf to inf?
Upvotes: 16
Views: 22193
Reputation: 1289
If I understand your question correctly, you mean the output of the predict function on a model fitted using rank:pairwise.
predict gives the predicted variable (y_hat).
This is the same for reg:linear, binary:logistic, etc. The only difference is that reg:linear builds trees to Min(RMSE(y, y_hat)), while rank:pairwise builds trees to Max(Map(Rank(y), Rank(y_hat))). However, the output is always y_hat.
Depending on the values of your dependent variable, the output can be anything. But I typically expect the output to have much smaller variance than the dependent variable. This is usually the case because it is not necessary to fit extreme data values; the trees just need to produce predictions that are large or small enough to be ranked first or last in the group.
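To illustrate why nothing in the objective itself bounds the scores, here is a minimal sketch. It assumes a RankNet-style pairwise logistic loss, which may differ in detail from what XGBoost computes internally; pair_loss and the gap values are hypothetical names and numbers chosen for illustration.

```python
import math

def pair_loss(s_better, s_worse):
    # Assumed RankNet-style logistic loss for one (better, worse) pair.
    # It depends only on the score gap, not on the absolute scores.
    return math.log1p(math.exp(-(s_better - s_worse)))

# Hypothetical score gaps between a more relevant and a less relevant item.
gaps = [0.0, 1.0, 5.0, 10.0]
losses = [pair_loss(g, 0.0) for g in gaps]

for g, l in zip(gaps, losses):
    print(f"gap={g:5.1f}  loss={l:.6f}")
```

The loss keeps shrinking as the gap grows but never reaches zero, so the loss alone sets no finite limit on the scores. In practice the number of trees, the learning rate, and regularization keep them moderate.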
Upvotes: 6
Reputation: 71
It gives the predicted score for ranking. However, the scores are only valid for ranking within their own group, so you must set the groups for the input data.
For easy ranking, refer to my project xgboostExtension
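As a rough sketch of what "valid only within their own group" means in practice, the snippet below ranks items per group in plain Python. It does not use xgboostExtension or XGBoost itself; rank_within_groups, the scores, and the group ids are all made up for illustration.

```python
from collections import defaultdict

def rank_within_groups(scores, group_ids):
    """Return, per group, item indices ordered from highest to lowest score."""
    by_group = defaultdict(list)
    for idx, (score, gid) in enumerate(zip(scores, group_ids)):
        by_group[gid].append((score, idx))
    return {
        gid: [idx for _, idx in sorted(items, key=lambda t: -t[0])]
        for gid, items in by_group.items()
    }

# Hypothetical predicted scores and the query group each item belongs to.
scores = [0.3, -1.2, 2.5, 7.0, 6.1]
group_ids = ["q1", "q1", "q1", "q2", "q2"]

ranking = rank_within_groups(scores, group_ids)
print(ranking)
```

Item 4 scores 6.1, higher than anything in q1, yet it ranks last in its own group q2. Comparing scores across groups tells you nothing.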
Upvotes: 7
Reputation: 1221
Good question. You may want to have a look at this Kaggle competition:
Actually, in the Learning to Rank field, we try to predict a relative score for each document with respect to a specific query. That is, this is neither a regression problem nor a classification problem. Hence, if a document attached to a query gets a negative predicted score, it means, and only means, that it is relatively less relevant to the query than other documents with positive scores.
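A small sketch of why the sign carries no absolute meaning: shifting every score in a query group by the same constant makes them all negative here but leaves the ordering untouched. The helper order and the numbers are hypothetical.

```python
def order(scores):
    # Indices of the documents, from most to least relevant by score.
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

# Hypothetical scores for three documents attached to the same query.
scores = [0.8, -0.4, -2.1]
# Subtracting a constant makes every score negative.
shifted = [s - 5.0 for s in scores]

print(order(scores), order(shifted))
```

Both orderings come out the same, so only the comparison between documents of the same query is informative.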
Upvotes: 8