Loyal_Burrito

Reputation: 125

Trying to bring predictions back to respective row in dataframe

My model returns its predictions as a NumPy ndarray of sigmoid outputs, and the values look correct. I'd now like to write each value from the array back into the dataframe on its corresponding row, and add a column that is 1 if the prediction is > .5 and 0 otherwise.

So far I can read the NumPy array, but I can't work out how to add it to the dataframe correctly, row by row.

employers = data_churn
# employers = np.array([employers])
predictions = model_churn.predict(employers)
predictions

employerPredictions = real_churn
employerPredictions = employerPredictions.rename(index=str, columns={"main_reason": "churned"})
employerPredictions.drop(['biztype_knowledge','biztype_field','biztype_creative','PercentEmpChg','PercentChgRevenue','PercentChgPay','amountOfResignations','nb_months_active'], axis=1, inplace=True)
if predictions.any() > .5:
    employerPredictions['predictedChurn'] = 1
    employerPredictions['ConfidenceWillChurn %'] = round((predictions[0][0] * 100), 2)
else:
    employerPredictions['predictedChurn'] = 0
    employerPredictions['ConfidenceWillNotChurn %'] = round(((1 - predictions[0][0]) * 100), 2)


employerPredictions

So far, the any() check just takes the first prediction and sets it for every row in the dataframe.

Upvotes: 1

Views: 164

Answers (1)

Francesco Zambolin

Reputation: 601

How to round predictions to 1s and 0s:

employerPredictions['predictedChurn'] = np.round(predictions).astype(np.int8)

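#Or compare against the threshold and cast the boolean mask to int
#(casting the raw floats with astype would truncate every value below 1.0 to 0)
employerPredictions['predictedChurn'] = (predictions >= 0.5).astype(np.int8)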

#Or use np.where
employerPredictions['predictedChurn'] = np.where(predictions>=0.5,1,0)
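To see how these options behave on concrete values, here is a minimal sketch. It assumes the model returns an array of shape (n, 1), which is typical for a single sigmoid output; the sample values and the small dataframe are made up for illustration and stand in for model_churn.predict(...) and real_churn.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for model_churn.predict(employers): shape (n, 1)
predictions = np.array([[0.91], [0.12], [0.50], [0.73]])

# Flatten to 1-D so each value lines up with exactly one dataframe row
flat = predictions.ravel()

# np.round rounds halves to the nearest even number, so 0.50 becomes 0
rounded = np.round(flat).astype(np.int8)

# A direct cast truncates toward zero, so every value below 1.0 becomes 0
truncated = flat.astype(np.int8)

# An explicit cutoff makes the boundary case unambiguous: 0.50 becomes 1
thresholded = np.where(flat >= 0.5, 1, 0)

# Hypothetical dataframe with one row per prediction, in the same order
employerPredictions = pd.DataFrame({"employer": ["a", "b", "c", "d"]})
employerPredictions["predictedChurn"] = thresholded
print(employerPredictions)
```

Assigning a NumPy array to a column is positional, so this only works if the dataframe rows are in the same order as the rows passed to predict.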

As for the ConfidenceWillChurn % and ConfidenceWillNotChurn % columns, I would try something like this, though I'm not sure it is what you're asking for.

employerPredictions['ConfidenceWillChurn %'] = np.where(predictions>=0.5,predictions*100,np.nan)

employerPredictions['ConfidenceWillNotChurn %'] = np.where(predictions<0.5,(1-predictions)*100,np.nan)

I put np.nan, but you can choose another value for rows where the condition is not satisfied. I used NumPy's where function. pandas has a where method as well, but it does something different: it keeps the original values where the condition holds and replaces the rest.
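A small sketch of the difference between the two, again with made-up values and a hypothetical dataframe, and assuming the predictions have already been flattened to 1-D:

```python
import numpy as np
import pandas as pd

# Hypothetical flattened sigmoid outputs, one per dataframe row
flat = np.array([0.91, 0.12, 0.50, 0.73])
df = pd.DataFrame({"employer": ["a", "b", "c", "d"]})

# numpy.where(cond, x, y): take x where cond is True, y elsewhere
df["ConfidenceWillChurn %"] = np.where(
    flat >= 0.5, np.round(flat * 100, 2), np.nan
)
df["ConfidenceWillNotChurn %"] = np.where(
    flat < 0.5, np.round((1 - flat) * 100, 2), np.nan
)

# Series.where(cond, other): keep the Series' own values where cond is True,
# replace them with `other` (NaN by default) elsewhere
pct = pd.Series(flat * 100).round(2)
will_churn_pct = pct.where(pct >= 50)

print(df)
print(will_churn_pct)
```

Here will_churn_pct holds the same values as the ConfidenceWillChurn % column, but the pandas version can only keep or replace the Series' own values, while numpy.where chooses between two arbitrary arrays.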

Upvotes: 1
