Reputation: 1324
I want to apply weights to my classes in LightGBM (i.e. manually force the model to prefer certain categories). I can see what my categories are, but when I build a class_weight dict from those categories, the model fails with ValueError: Class label [somevalue] not present.
import lightgbm as lgbm
### Data prep
#[skipping as long & irrelevant -- only need to know classes for the question]#
### Get classes from data
model = lgbm.LGBMClassifier()
model.fit(X_train,y_train)
model.classes_
gives:
array([ 100., 200., 300., 500., 600., 700., 800., 1000.])
Applying the known classes to a class_weight dict, as per the documentation:
class_weight (dict, 'balanced' or None, optional (default=None)) – Weights associated with classes in the form {class_label: weight}.[...]
model = lgbm.LGBMClassifier(class_weight = {100.:1, 200.:20, 300.:30, 500.:50, 600.:60, 700.:70, 800.:80,1000.:100} )
model.fit(X_train,y_train)
and we get the error:
ValueError: Class label 100.0 not present.
If we reorder or delete elements, the same error is raised for whichever element comes first in the dictionary.
Upvotes: 1
Views: 3200
Reputation: 1324
It looks like LightGBM doesn't take class_label values in the class_weight dictionary. Instead, it places your labels in ascending order and you have to refer to them by index according to that order.
so
class_weight = {100.:1, 200.:20, 300.:30, 500.:50, 600.:60, 700.:70, 800.:80, 1000.:100}
becomes
class_weight = {0:1, 1:20, 2:30, 3:50, 4:60, 5:70, 6:80, 7:100}
and the working code is:
model = lgbm.LGBMClassifier(class_weight = {0:1, 1:20, 2:30, 3:50, 4:60, 5:70, 6:80, 7:100} )
model.fit(X_train,y_train)
Which is a real undocumented doozy of a behaviour and makes the code quite hard to understand...
Can anyone confirm that this is the correct, intended behaviour?
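To avoid writing the indices by hand, the label-to-index mapping described in this answer can be built from the sorted labels. The following is only a sketch: to_index_weights and y_train_example are hypothetical names invented for illustration, not part of LightGBM, and it assumes the index is simply each label's position in ascending order, as observed above. Whether this re-keying is needed may depend on the installed LightGBM version.

```python
def to_index_weights(label_weights, labels):
    """Re-key a {class_label: weight} dict by each label's position
    in the ascending order of the distinct labels (hypothetical helper)."""
    classes = sorted(set(labels))  # distinct labels, ascending
    index_of = {label: i for i, label in enumerate(classes)}
    missing = [label for label in label_weights if label not in index_of]
    if missing:
        raise ValueError(f"labels not found in data: {missing}")
    return {index_of[label]: weight for label, weight in label_weights.items()}


# Stand-in for y_train (assumption: the same eight classes as in the question).
y_train_example = [100., 200., 300., 500., 600., 700., 800., 1000., 100., 500.]
label_weights = {100.: 1, 200.: 20, 300.: 30, 500.: 50,
                 600.: 60, 700.: 70, 800.: 80, 1000.: 100}

index_weights = to_index_weights(label_weights, y_train_example)
print(index_weights)
# {0: 1, 1: 20, 2: 30, 3: 50, 4: 60, 5: 70, 6: 80, 7: 100}
```

The resulting dict could then be passed as class_weight=index_weights, which keeps the readable label-keyed weights in the code while giving the model the index-keyed form it accepted here.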
Upvotes: 2