Reputation: 435
I am working on interpreting my XGBoost model. Take, for example, the two datasets trainInputs and trainOutputs below, respectively:
df.trainInputs
input1 input2 input3
0 1 0 0
1 1 1 0
2 0 1 1
..
df.trainOutputs
output
0 1
1 0
2 1
...
The user inputs have been one-hot encoded, and the output data is a list of user output patterns. I train my XGBoost model on these and then predict from another matrix of one-hot encoded user input data taken from a different dataset. I was hoping to get a percentage score for each element of the output column, but when I run the model I only get binary output. Is there something I am missing in building my model? The relevant code:
df.predictInputs
input1 input2 input3
0 1 1 0
1 1 0 0
2 1 0 1
..
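import xgboost as xgb

model = xgb.XGBClassifier()
model.fit(trainInputs, trainOutputs)
# predict on the new one-hot encoded inputs; predict() returns class labels (0 or 1)
y_pred = model.predict(predictInputs)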
Upvotes: 2
Views: 1797
Reputation: 2996
If you want to get the result probability (percentage score for each element), use predict_proba instead of predict.
Upvotes: 4