Reputation: 2024
I trained a model using xgboost v0.90 to be compatible with AWS SageMaker ML Engine. I am doing the usual encoding and hyper-parameter tuning. Some code below:
import pandas as pd
import pickle
from xgboost import XGBRegressor
from sklearn.model_selection import train_test_split, GridSearchCV, RandomizedSearchCV
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder
# split df into train and test
X_train, X_test, y_train, y_test = train_test_split(df.iloc[:,0:21], df.iloc[:,-1], test_size=0.1)
X_train.shape
(1000,21)
# Encode categorical variables
cat_vars = ['cat1','cat2','cat3']
cat_transform = ColumnTransformer([('cat', OneHotEncoder(handle_unknown='ignore'), cat_vars)], remainder='passthrough')
encoder = cat_transform.fit(X_train)
X_train = encoder.transform(X_train)
X_test = encoder.transform(X_test)
X_train.shape
(1000,420)
# Define a xgboost regression model
model = XGBRegressor()
# Do hyper-parameter tuning
.....
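# (Hypothetical sketch of the elided tuning step, using the RandomizedSearchCV
#  imported above; the actual search space is not shown here)
param_dist = {'max_depth': [3, 4, 6], 'learning_rate': [0.01, 0.05, 0.1]}
search = RandomizedSearchCV(model, param_distributions=param_dist, n_iter=5, cv=3)
search.fit(X_train, y_train)
model = search.best_estimator_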
# Fit model
model.fit(X_train, y_train)
# Forecast on test data
y_pred = model.predict(X_test, pred_contribs=True)
y_pred
I have installed SHAP and, based on the documentation [1], .predict() takes a pred_contribs argument. Traceback:
TypeError Traceback (most recent call last)
<ipython-input-1119-37e607e853fd> in <module>
1 # Forecast on test data
----> 2 y_pred = model.predict(X_test, pred_contribs=True)
3 y_pred
TypeError: predict() got an unexpected keyword argument 'pred_contribs'
[1] https://xgboost.readthedocs.io/en/latest/python/python_api.html#xgboost.Booster.predict
Upvotes: 2
Views: 6421
Reputation: 1890
XGBRegressor's predict() does not have a pred_contribs
parameter; it only has these parameters:
predict(self, X, output_margin=False, ntree_limit=None,
validate_features=True, base_margin=None, iteration_range=None)
The pred_contribs
parameter that you are talking about is on xgboost.Booster.predict()
, which is the low-level XGBoost API. It has these parameters:
predict(self, data: xgboost.core.DMatrix, output_margin: bool = False,
ntree_limit: int = 0, pred_leaf: bool = False, pred_contribs: bool = False,
approx_contribs: bool = False, pred_interactions: bool = False,
validate_features: bool = True, training: bool = False,
iteration_range: Tuple[int, int] = (0, 0), strict_shape: bool = False)
So you have to use the low-level API instead:
import xgboost

data = xgboost.DMatrix(X, label=y)
model = xgboost.train({"learning_rate": 0.01, "max_depth": 4}, data)
# pred_contribs gives per-feature contributions plus a final bias column
model.predict(data, pred_contribs=True)
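Alternatively, if you want to keep the scikit-learn wrapper from the question, a minimal sketch (untested, assuming reg is your fitted XGBRegressor and X_test is the already-encoded test matrix) is to pull out the underlying Booster with get_booster() and call the low-level predict() on a DMatrix:
import xgboost
# reg is the fitted XGBRegressor from the question, X_test the encoded test matrix
booster = reg.get_booster()
# One contribution column per encoded feature plus a final bias column
contribs = booster.predict(xgboost.DMatrix(X_test), pred_contribs=True)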
Upvotes: 1