Issue implementing XGBoost Regressor

Question

I'm a beginner in machine learning and am working with the Abalone dataset, trying to predict the age of the abalones. I ran an XGBoost regressor, and the following code worked fine:

model=XGBRegressor(n_estimators=500,learning_rate=0.05)
model.fit(X_train,y_train)
X_train_preds = model.predict(X_train)
X_test_preds = model.predict(X_test)

But when I add early stopping rounds and an evaluation set, it stops working:

model=XGBRegressor(n_estimators=500,learning_rate=0.05)
model.fit(X_train,y_train, early_stopping_rounds=5, eval_set=([X_test,y_test]))
X_train_preds = model.predict(X_train)
X_test_preds = model.predict(X_test)

and gives the following error:

Traceback (most recent call last):

  File "", line 1, in 
    runfile('C:/Users/dell/.spyder-py3/Abalone_project.py', wdir='C:/Users/dell/.spyder-py3')

  File "E:\l\Anaconda3\lib\site-packages\spyder\utils\site\sitecustomize.py", line 705, in runfile
    execfile(filename, namespace)

  File "E:\l\Anaconda3\lib\site-packages\spyder\utils\site\sitecustomize.py", line 102, in execfile
    exec(compile(f.read(), filename, 'exec'), namespace)

  File "C:/Users/dell/.spyder-py3/Abalone_project.py", line 47, in <module>
    model.fit(X_train,y_train, early_stopping_rounds=5, eval_set=([X_test,y_test]), verbose=False)

  File "E:\l\Anaconda3\lib\site-packages\xgboost\sklearn.py", line 370, in fit
    for i in range(len(eval_set)))

  File "E:\l\Anaconda3\lib\site-packages\xgboost\sklearn.py", line 370, in <genexpr>
    for i in range(len(eval_set)))

  File "E:\l\Anaconda3\lib\site-packages\pandas\core\frame.py", line 2685, in __getitem__
    return self._getitem_column(key)

  File "E:\l\Anaconda3\lib\site-packages\pandas\core\frame.py", line 2692, in _getitem_column
    return self._get_item_cache(key)

  File "E:\l\Anaconda3\lib\site-packages\pandas\core\generic.py", line 2486, in _get_item_cache
    values = self._data.get(item)

  File "E:\l\Anaconda3\lib\site-packages\pandas\core\internals.py", line 4115, in get
    loc = self.items.get_loc(item)

  File "E:\l\Anaconda3\lib\site-packages\pandas\core\indexes\base.py", line 3065, in get_loc
    return self._engine.get_loc(self._maybe_cast_indexer(key))

  File "pandas\_libs\index.pyx", line 140, in pandas._libs.index.IndexEngine.get_loc

  File "pandas\_libs\index.pyx", line 162, in pandas._libs.index.IndexEngine.get_loc

  File "pandas\_libs\hashtable_class_helper.pxi", line 1492, in pandas._libs.hashtable.PyObjectHashTable.get_item

  File "pandas\_libs\hashtable_class_helper.pxi", line 1500, in pandas._libs.hashtable.PyObjectHashTable.get_item

KeyError: 0

Can someone please tell me what's causing the error and how to correct it?

Zabir Al Nazi Nabil · Accepted Answer

Try changing this line

model.fit(X_train,y_train, early_stopping_rounds=5, eval_set=([X_test,y_test]))

to

model.fit(X_train,y_train, early_stopping_rounds=5, eval_set=[(X_test,y_test)])

Here is your code with that fix applied, using dummy data; it runs without error:

from xgboost import XGBRegressor

# dummy data

X_train = [[0,1], [1,2], [3,2]]
y_train = [0, 1, 0]

model=XGBRegressor(n_estimators=500,learning_rate=0.05)
model.fit(X_train,y_train, early_stopping_rounds=5, eval_set=[(X_train,y_train)])
X_train_preds = model.predict(X_train)

From the documentation:

 eval_set(evals, iteration=0, feval=None)

    Evaluate a set of data.

    Parameters

            evals (list of tuples (DMatrix, string)) – List of items to be evaluated.

            iteration (int) – Current iteration.

            feval (function) – Custom evaluation function.

    Returns

        result – Evaluation result string.

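The entry quoted above describes Booster.eval_set, whose evals argument is a list of (DMatrix, string) tuples. The eval_set argument of fit follows the same pattern: it expects a list of (X, y) tuples. So the tuple goes inside the list, not the other way around. In your call, ([X_test,y_test]) is just the flat list [X_test, y_test], because the outer parentheses only group the expression. XGBoost then treats X_test itself as an (X, y) pair and indexes it with 0, which pandas reads as a lookup of a column named 0. That is where the KeyError: 0 comes from.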
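
To make the nesting difference concrete, here is a minimal sketch in plain Python that does not call XGBoost at all. It assumes a dict keyed by column name as a stand-in for the DataFrame, and a hypothetical helper, unpack_pairs, that reads each entry of eval_set by position in the way the traceback suggests. The column names and values are made up for illustration.

```python
# Stand-in for a DataFrame: a dict keyed by column name, so looking up the
# integer 0 fails much like a column lookup would (hypothetical column names).
X_test = {"Length": [0.45, 0.35], "Diameter": [0.36, 0.26]}
y_test = [15, 7]


def unpack_pairs(eval_set):
    """Hypothetical helper: read each entry as an (X, y) pair by position."""
    return [(entry[0], entry[1]) for entry in eval_set]


wrong = ([X_test, y_test])   # parentheses only group: a flat list [X, y]
right = [(X_test, y_test)]   # a one-element list holding an (X, y) tuple

pairs = unpack_pairs(right)  # works: exactly one (X, y) pair
print(len(pairs))            # prints 1

try:
    unpack_pairs(wrong)      # first entry is X_test itself, so X_test[0] is looked up
except KeyError as err:
    print("KeyError:", err)  # prints KeyError: 0
```

The helper is only an illustration of positional unpacking, not XGBoost's actual code, but it fails in the same way for the same reason.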
