Mark_Anderson

Reputation: 1324

LightGBM classifier errors on class_weight

I want to apply weights to my classes in LightGBM (i.e. manually force the model to prefer certain categories). I can see what my categories are, but when I build a class_weight dict keyed by those categories, the model fails with ValueError: Class label [somevalue] not present.

import lightgbm as lgbm

### Data prep
#[skipping as long & irrelevant -- only need to know classes for the question]#

### Get classes from data
model = lgbm.LGBMClassifier()
model.fit(X_train, y_train)
model.classes_

gives: array([ 100., 200., 300., 500., 600., 700., 800., 1000.])

Next, I apply the known classes to a class_weight dict, as per the documentation:

class_weight (dict, 'balanced' or None, optional (default=None)) – Weights associated with classes in the form {class_label: weight}.[...]

model = lgbm.LGBMClassifier(class_weight={100.: 1, 200.: 20, 300.: 30, 500.: 50, 600.: 60, 700.: 70, 800.: 80, 1000.: 100})
model.fit(X_train, y_train)

and we get the error: ValueError: Class label 100.0 not present.

The error always names the first key in the dictionary, even if we reorder or delete elements.

Upvotes: 1

Views: 3200

Answers (1)

Mark_Anderson

Reputation: 1324

It looks like LightGBM doesn't accept the class label values themselves as keys in the class_weight dictionary. Instead, it sorts your labels in ascending order, and you have to refer to each class by its index in that sorted order.

so

class_weight = {100.: 10, 200.: 20, 300.: 30, 500.: 50, 600.: 60, 700.: 70, 800.: 80, 1000.: 100}

becomes

class_weight = {0: 10, 1: 20, 2: 30, 3: 50, 4: 60, 5: 70, 6: 80, 7: 100}

and the working code is:

model = lgbm.LGBMClassifier(class_weight={0: 1, 1: 1, 2: 30, 3: 50, 4: 60, 5: 70, 6: 80, 7: 100})
model.fit(X_train, y_train)
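
To avoid hand-counting indices, the label-keyed dict can be re-keyed programmatically. The following is a minimal sketch, assuming the index order is simply the ascending sort of the distinct labels as described above. The helper name `to_index_weights` and the `y_example` data are made up for illustration and are not part of LightGBM.

```python
def to_index_weights(label_weights, labels):
    """Re-key a {class_label: weight} dict by each label's position in ascending order."""
    ordered = sorted(set(labels))  # distinct labels, ascending
    missing = set(label_weights) - set(ordered)
    if missing:
        raise ValueError(f"Labels not present in data: {sorted(missing)}")
    index_of = {label: i for i, label in enumerate(ordered)}
    return {index_of[label]: weight for label, weight in label_weights.items()}


label_weights = {100.: 10, 200.: 20, 300.: 30, 500.: 50,
                 600.: 60, 700.: 70, 800.: 80, 1000.: 100}
# Illustrative stand-in for y_train; any iterable of labels works.
y_example = [100., 200., 300., 500., 600., 700., 800., 1000., 100., 500.]

index_weights = to_index_weights(label_weights, y_example)
print(index_weights)
# {0: 10, 1: 20, 2: 30, 3: 50, 4: 60, 5: 70, 6: 80, 7: 100}

# Hypothetical usage with the model from the question:
# model = lgbm.LGBMClassifier(class_weight=to_index_weights(label_weights, y_train))
# model.fit(X_train, y_train)
```

This keeps the human-readable labels in the source and derives the indices from the data. Whether index keys are needed at all may depend on the installed LightGBM version, so it is worth checking against your own setup first.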

This is a real doozy of undocumented behaviour, and it makes the code quite hard to understand...

Can anyone confirm that this is the intended behaviour?
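
If the index-keyed dict is too opaque, one possible alternative is to sidestep class_weight entirely and pass per-sample weights through the `sample_weight` argument of `fit`. This is a sketch under the assumption that weighting each sample by its class weight has a comparable effect; I have not verified that the two give identical models. The helper `per_sample_weights` and the example data are hypothetical.

```python
def per_sample_weights(label_weights, labels, default=1.0):
    """Look up a weight for every sample from a {class_label: weight} dict."""
    return [label_weights.get(label, default) for label in labels]


label_weights = {100.: 10, 200.: 20, 300.: 30, 500.: 50,
                 600.: 60, 700.: 70, 800.: 80, 1000.: 100}
# Illustrative stand-in for y_train.
y_example = [100., 1000., 300., 100.]

weights = per_sample_weights(label_weights, y_example)
print(weights)  # [10, 100, 30, 10]

# Hypothetical usage with the model from the question:
# model = lgbm.LGBMClassifier()
# model.fit(X_train, y_train, sample_weight=per_sample_weights(label_weights, y_train))
```

Here the weights stay keyed by the real class labels, so the dict reads the same way as the one in the question.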

Upvotes: 2
