Upen Kambhampati

Reputation: 41

xgboost ranking objectives pairwise vs (ndcg & map)

I'm using XGBoost to rank a set of products on product overview pages. The relevance label encodes how relevant a product is in terms of popularity, profitability, etc. The features are product-related features such as revenue, price, clicks, and impressions.

I am aware that rank:pairwise, rank:ndcg, and rank:map all implement the LambdaMART algorithm, but they differ in how the model is optimised.
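For intuition about what rank:ndcg optimises, NDCG is computed per group from the graded labels. A minimal stdlib sketch of the common formulation (gain 2^rel − 1, log2 position discount); this is for illustration and is not taken from XGBoost's internals:

```python
import math

def dcg(labels):
    # Discounted cumulative gain: items with higher grades ranked
    # earlier contribute more, discounted by log2 of the position.
    return sum((2 ** rel - 1) / math.log2(i + 2) for i, rel in enumerate(labels))

def ndcg(ranked_labels):
    # Normalise by the DCG of the ideal (descending-by-grade) ordering.
    ideal = dcg(sorted(ranked_labels, reverse=True))
    return dcg(ranked_labels) / ideal if ideal > 0 else 0.0

# Labels 0-3 as in the question: a perfect ranking scores 1.0,
# a reversed ranking scores less.
print(ndcg([3, 2, 1, 0]))  # 1.0
print(ndcg([0, 1, 2, 3]))
```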

Below are the details of my training set: 800 data points divided into two groups (types of products), hence 400 data points per group. The labels range from 0 to 3, where 0 means no relevance and 3 is the highest relevance.

x_train shape: (800, 10)
y_train shape: 800
group_train: [400, 400]
labels: [0., 1., 2., 3.]

Similarly, below are my validation and test sets.

x_val shape: (400, 10)
y_val shape: 400
group_val: [200, 200]

x_test shape: (160, 10)
y_test shape: 160
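As a sanity check, each group array must partition its split: the entries have to sum to the number of rows XGBRanker is fitted on, or training fails or silently misgroups items. A quick stdlib sketch using the shapes listed above (the split names are just labels for this example):

```python
# Each (rows, group_sizes) pair mirrors the shapes listed above.
splits = {
    "train": (800, [400, 400]),
    "val": (400, [200, 200]),
}

def groups_partition(n_rows, group_sizes):
    # XGBRanker's `group` argument must sum to the number of rows in X.
    return sum(group_sizes) == n_rows

for name, (n_rows, sizes) in splits.items():
    assert groups_partition(n_rows, sizes), name
print("group arrays partition every split")
```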

Below are the model parameters I'm initially trying out:

params = {'objective': 'rank:pairwise', 'learning_rate': 0.1,
          'gamma': 1.0, 'min_child_weight': 0.1,
          'max_depth': 6, 'n_estimators': 100}
model = xgb.sklearn.XGBRanker(**params)
model.fit(x_train_sample, y_train_sample, group=group_train, verbose=False,
          eval_set=[(x_val_sample, y_val_sample)], eval_group=[group_val])

The predictions look like the output below, which is what I expect.

7.56624222e-01,  3.05949116e+00,  3.86625218e+00,  1.57079172e+00,
4.26489925e+00,  7.92866111e-01,  3.58812737e+00,  4.02488470e+00,
3.88625526e+00,  2.50904512e+00,  3.43187213e+00,  3.60899544e+00,
2.86354733e+00,  4.36567593e+00,  1.22325927e-01,  2.79849982e+00,
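Note that these scores have no absolute meaning on their own; only their relative order within a group matters. For example, ranking the first four products by their predicted score (stdlib only, scores copied from the output above):

```python
# First four predicted scores from the pairwise model's output above.
scores = [7.56624222e-01, 3.05949116e+00, 3.86625218e+00, 1.57079172e+00]

# Indices sorted from highest to lowest score give the product ranking.
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
print(ranking)  # [2, 1, 3, 0]
```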

But when I change the objective to rank:ndcg:

params = {'objective': 'rank:ndcg', 'learning_rate': 0.1,
          'gamma': 1.0, 'min_child_weight': 0.1,
          'max_depth': 6, 'n_estimators': 100}
model = xgb.sklearn.XGBRanker(**params)
model.fit(x_train_sample, y_train_sample, group=group_train, verbose=False,
          eval_set=[(x_val_sample, y_val_sample)], eval_group=[group_val])

my predictions become degenerate: every item gets the same score.

[0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5,
   0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5,
   0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5,

Could someone help me understand why?

Upvotes: 4

Views: 6742

Answers (1)

Evelyn Omo

Reputation: 89

I had the same issue at first; removing gamma worked for me. You can try:

params = {'objective': 'rank:ndcg', 'learning_rate': 0.1,
          'min_child_weight': 0.1,
          'max_depth': 6, 'n_estimators': 100}
model = xgb.sklearn.XGBRanker(**params)
model.fit(x_train_sample, y_train_sample, group=group_train, verbose=False,
          eval_set=[(x_val_sample, y_val_sample)], eval_group=[group_val])

Upvotes: 1
