Reputation: 1250
I was running lightgbm with categorical features:
X_train, X_test, y_train, y_test = train_test_split(train_X, train_y, test_size=0.3)
train_data = lgb.Dataset(X_train, label=y_train, feature_name=X_train.columns,
categorical_feature=cat_features)
test_data = lgb.Dataset(X_test, label=y_test, reference=train_data)
param = {'num_trees': 4000, 'objective':'binary', 'metric': 'auc'}
bst = lgb.train(param, train_data, valid_sets=[test_data], early_stopping_rounds=100)
This raises the following error:
if self.handle is not None and feature_name is not None and feature_name != 'auto':
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
I checked other similar errors on Stack Overflow, which are mostly related to numpy. I then checked the documentation and tried replacing my categorical_feature with indices like [0, 2, 5, ...] (originally I passed the column names of the categorical features), but I still get the same error.
I also tried replacing label with the column index, but the error persists.
Could anyone help? Thanks in advance.
Upvotes: 8
Views: 5042
Reputation: 1250
I also found that dropping feature_name works:
train_data = lgb.Dataset(X_train, label=y_train, categorical_feature=cat_features)
test_data = lgb.Dataset(X_test, label=y_test, reference=train_data)
param = {'num_trees': 4000, 'objective':'binary', 'metric': 'auc'}
bst = lgb.train(param, train_data, valid_sets=[test_data], early_stopping_rounds=100)
Upvotes: 0
Reputation: 3223
I think there is an issue with the way you pass feature_name. The constructor expects a list, and you pass it a pandas.core.indexes.base.Index. The problem is that on such an object the feature_name != 'auto' condition from the if statement mentioned in the error acts element-wise. Thus the and chain has to take the truth value of a numpy.ndarray instead of a single bool, which is ambiguous.
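The element-wise comparison can be reproduced without lightgbm at all. The following is a minimal sketch; the column names are made up for illustration and stand in for X_train.columns.

```python
import pandas as pd

# Hypothetical column names, standing in for X_train.columns.
feature_name = pd.Index(["age", "city", "income"])

# Comparing an Index with a string is element-wise: one boolean per column name.
comparison = feature_name != "auto"
print(comparison)

# An `if` needs a single truth value, which a multi-element array cannot provide.
try:
    if feature_name is not None and feature_name != "auto":
        pass
    raised = False
except ValueError as exc:
    raised = True
    print("ValueError:", exc)
```

The same condition applied to a plain list compares the whole list to the string and gives a single bool, which is why converting to a list avoids the error.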
A simple solution would be either to convert it to a list (feature_name=X_train.columns.tolist()) or to use feature_name='auto', which will extract the names from a pd.DataFrame internally.
Upvotes: 8