John T. Miller

Reputation: 21

XGBoost and Numpy Issue

I can't properly fit my data to XGBoost. Changing my data type does not help.

There are 1225 rows and 15 columns.

RangeIndex(start=0, stop=1225, step=1)

Other classification algorithms work fine, but XGBoost gave me the error below after running this code.

import xgboost as xgb
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(loans.index, loans.BAD, test_size=0.2, random_state=0)

train = xgb.DMatrix(X_train, label=y_train)
test = xgb.DMatrix(X_test, label=y_test)

param = {'max_depth':2, 'eta':1, 'objective':'binary:logistic' }

num_round = 2
bst = xgb.train(param, X_train, 10)

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-117-378a1a19d4c9> in <module>
      1 param = {'max_depth':2, 'eta':1, 'objective':'binary:logistic' }
      2 num_round = 2
----> 3 bst = xgb.train(param, X_train, 10)

~\Anaconda3\lib\site-packages\xgboost\training.py in train(params, dtrain, num_boost_round, evals, obj, feval, maximize, early_stopping_rounds, evals_result, verbose_eval, xgb_model, callbacks)
    207                            evals=evals,
    208                            obj=obj, feval=feval,
--> 209                            xgb_model=xgb_model, callbacks=callbacks)
    210 
    211 

~\Anaconda3\lib\site-packages\xgboost\training.py in _train_internal(params, dtrain, num_boost_round, evals, obj, feval, xgb_model, callbacks)
     28             params += [('eval_metric', eval_metric)]
     29 
---> 30     bst = Booster(params, [dtrain] + [d[0] for d in evals])
     31     nboost = 0
     32     num_parallel_tree = 1

~\Anaconda3\lib\site-packages\xgboost\core.py in __init__(self, params, cache, model_file)
   1026         for d in cache:
   1027             if not isinstance(d, DMatrix):
-> 1028                 raise TypeError('invalid cache item: {}'.format(type(d).__name__), cache)
   1029             self._validate_features(d)
   1030 

TypeError: ('invalid cache item: Int64Index', [Int64Index([ 359,  745,  682,  903,  548,  906, 1040,  467,   85,  192,
            ...
             600, 1094,  599,  277, 1033,  763,  835, 1216,  559,  684],
           dtype='int64', length=980)])

Upvotes: 1

Views: 1253

Answers (1)

yatu

Reputation: 88226

When using the Learning API, xgboost.train expects the training DMatrix as its second argument, whereas you're feeding it X_train. You should be using:

xgb.train(param, train)
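For completeness, a minimal sketch of the corrected call, reusing the train and test DMatrix objects already built in the question. The third positional argument of xgboost.train is num_boost_round; the predict call at the end is an illustrative addition, not part of the original answer.

import xgboost as xgb

# `train` and `test` are the xgb.DMatrix objects built earlier from the split data.
param = {'max_depth': 2, 'eta': 1, 'objective': 'binary:logistic'}
num_round = 2

# Pass the DMatrix (not X_train); the third argument is the number of boosting rounds.
bst = xgb.train(param, train, num_round)

# Predicted probabilities for the held-out DMatrix (binary:logistic outputs probabilities).
preds = bst.predict(test)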

Upvotes: 1
