Reputation: 81
How do I use 'class_weights' with CatBoostClassifier for a multiclass problem? The documentation says it should be a list, but in what order do I need to put the weights? I have a label array with 15 classes ranging from -2 to +2, including decimal values, and class 0 is much more frequent than the others. Please help. Thanks.
I tried it for the binary case, which is easier to work with, but I have no clue how to do it for multiclass.
cb_model_step1 = run_catboost(X_train, y_train_new, X_test, y_test_new, n_estimators=1000, verbose=100, eta=0.3, loss_function='MultiClassOneVsAll', class_weights=counter_new)
cb = CatBoostClassifier(thread_count=4, n_estimators=n_estimators, max_depth=10, class_weights=class_weights, eta=eta, loss_function=loss_function)
Upvotes: 8
Views: 18106
Reputation: 201
It is now possible to pass a dictionary that maps each label to its weight.
Suppose we have X_train, y_train and a multiclass classification problem. Then we can do the following:
import numpy as np
from catboost import CatBoostClassifier
from sklearn.utils.class_weight import compute_class_weight
classes = np.unique(y_train)
weights = compute_class_weight(class_weight='balanced', classes=classes, y=y_train)
class_weights = dict(zip(classes, weights))
clf = CatBoostClassifier(loss_function='MultiClassOneVsAll', class_weights=class_weights)
clf.fit(X_train, y_train)
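To see what the 'balanced' heuristic produces for decimal labels like those in the question, here is a minimal sketch using only the standard library. The helper name balanced_class_weights and the toy labels are made up for illustration; the formula n_samples / (n_classes * count) is the one scikit-learn documents for class_weight='balanced'.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return {label: n_samples / (n_classes * count)} for each distinct label."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    # Sort by label so the printed dictionary is easy to read.
    return {
        label: n_samples / (n_classes * count)
        for label, count in sorted(counts.items())
    }


# Toy labels (hypothetical data): class 0.0 dominates, as in the question.
y_toy = [0.0] * 6 + [-0.5] * 2 + [1.5] * 2
class_weights = balanced_class_weights(y_toy)
print(class_weights)
```

The frequent class 0.0 gets a weight below 1 and the rare classes get weights above 1. Because the dictionary is keyed by the label itself, it can be passed as class_weights without worrying about ordering.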
Upvotes: 20
Reputation: 21
You need to fit the model on your dataset without any weights first, then read the classes_ attribute of the fitted model (not of a fresh CatBoostClassifier() instance). It shows the order of the classes in CatBoost:
model_multiclass = CatBoostClassifier(iterations=1000,
                                      depth=4,
                                      learning_rate=0.05,
                                      loss_function='MultiClass',
                                      verbose=True,
                                      early_stopping_rounds=200,
                                      bagging_temperature=1,
                                      metric_period=100)
model_multiclass.fit(X_train, Y_train)
model_multiclass.classes_
Result: ['35мр', '4мр', 'вывод на ИП', 'вывод на кк', 'вывод на фл', 'транзит']
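Once the class order is known, a weight list can be built to match it. The sketch below assumes, as this answer suggests, that list-form weights are matched to classes in the order given by classes_. The helper weights_in_class_order and the labels "a", "b", "c" are hypothetical stand-ins, not part of CatBoost.

```python
def weights_in_class_order(class_order, weight_by_label, default=1.0):
    """Return weights as a list aligned with class_order.

    Labels missing from weight_by_label fall back to `default`.
    """
    return [weight_by_label.get(label, default) for label in class_order]


# Stand-in for list(model_multiclass.classes_) from a fitted model.
class_order = ["a", "b", "c"]
# Weights chosen by hand, in arbitrary order (hypothetical values).
weight_by_label = {"c": 3.0, "a": 0.5}

weight_list = weights_in_class_order(class_order, weight_by_label)
print(weight_list)  # [0.5, 1.0, 3.0]
```

The resulting list could then be passed as class_weights when refitting, under the ordering assumption stated above.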
Upvotes: 2