Adding Probability to the predicted value

Question

I have a testDF like this and want to do binary classification (0 or 1):

I also have a trainDF with the same structure, with the bad column already filled in for training purposes.

I build the target and training arrays from trainDF:

target = trainDF.bad.values
train = trainDF.drop('bad', axis=1).values

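Then I create the logistic regression model and hold out 30% of the training data for validation:

from sklearn import linear_model
from sklearn.model_selection import train_test_split

# Use the estimator directly; a list has no .fit method
model = linear_model.LogisticRegression(C=1e5)
TRNtrain, TRNtest, TARtrain, TARtest = train_test_split(train, target, test_size=0.3, random_state=0)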

Then I fit on the training split and predict probabilities for the held-out split:

model.fit(TRNtrain, TARtrain)
pred_scr = model.predict_proba(TRNtest)[:, 1]

Then I fit on the whole training set and predict the bad value for testDF:

model.fit(train, target)
test = testDF.drop('bad', axis=1).values
testDF.bad=model.predict(test)

I get a DataFrame with the bad column filled in:

My question: how can I add the logistic regression probability that bad = 1 as an additional column? What steps should I take?

Any help would be greatly appreciated!

James · Accepted Answer

The .predict method returns only the most probable class for each input row. If you want the probabilities themselves, you can use:

log_prob = model.predict_log_proba(test)  # Log of probability estimates.
prob = model.predict_proba(test)   # Probability estimates.

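Both methods return an array of shape (n_samples, n_classes), with one column per class, so select the column for the class you want before assigning it to the data frame. For bad = 1 in a 0/1 problem, that is column 1:

testDF['bad_prob'] = model.predict_proba(test)[:, 1]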
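
The columns of predict_proba follow the order of model.classes_, so it can be safer to look up the column for class 1 rather than hard-coding the index. Here is a minimal, self-contained sketch of the whole flow. The synthetic data and the feature names f1 and f2 are made up for illustration and stand in for the real trainDF and testDF:

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Synthetic stand-ins for trainDF / testDF (f1, f2 are hypothetical features).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
noise = rng.normal(scale=0.5, size=200)
y = (X[:, 0] + 0.5 * X[:, 1] + noise > 0).astype(int)

trainDF = pd.DataFrame(X[:150], columns=["f1", "f2"])
trainDF["bad"] = y[:150]
testDF = pd.DataFrame(X[150:], columns=["f1", "f2"])

# Fit on the full training set, as in the question.
model = LogisticRegression(C=1e5)
model.fit(trainDF.drop("bad", axis=1).values, trainDF["bad"].values)

test = testDF[["f1", "f2"]].values
proba = model.predict_proba(test)  # shape: (n_samples, n_classes)

# Columns are ordered like model.classes_, so find the column for class 1.
pos_col = list(model.classes_).index(1)

testDF["bad"] = model.predict(test)      # hard 0/1 prediction
testDF["bad_prob"] = proba[:, pos_col]   # estimated P(bad = 1)
```

Each row of proba sums to 1, so the probability of bad = 0 is simply 1 - bad_prob if you need it as well.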
