Reputation: 937
Hi, I've fitted a random forest on a dataset imported as df. Now I would like to export both the results (0/1 predictions) and the predicted probabilities (a two-dimensional array) and match them to my dataset df. Is that possible? So far I have only figured out how to export them separately to CSV. And yes, I am not a pandas expert yet. Any hints?
# Import pandas and the `RandomForestClassifier`
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
# Create the target and features numpy arrays:
target = df["target"].values
features = df[["var1", "var2", "var3", "var4", "var5"]]
features_forest = features
# Building and fitting my_forest
forest = RandomForestClassifier(max_depth = 10, min_samples_split=2, n_estimators = 200, random_state = 1)
my_forest = forest.fit(features_forest, target)
# Print the score of the fitted random forest
print(my_forest.score(features_forest, target))
print(my_forest.feature_importances_)
results = my_forest.predict(features)
print(results)
predicted_probs = forest.predict_proba(features)
#predicted_probs = my_forest.predict_proba(features)
print(predicted_probs)
id_test = df['ID_CONTACT']
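# predict_proba returns a 2-D array, so one column can hold only one class; column 1 is used here.
# Raw strings keep the backslashes in the Windows paths from being read as escapes.
pd.DataFrame({"id": id_test, "relevance": results, "probs": predicted_probs[:, 1]}).to_csv(r'C:\Users\me\Desktop\python\data\submission.csv', index=False)
pd.DataFrame(predicted_probs).to_csv(r'C:\Users\me\Desktop\python\data\submission_2.csv', index=False)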
Upvotes: 0
Views: 1220
Reputation: 42885
You should be able to add the predictions as a new column and then concatenate the probabilities:
df['results'] = results
df = pd.concat([df, pd.DataFrame(predicted_probs, columns=['Col_1', 'Col_2'])], axis=1)
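One thing to watch: `pd.concat(..., axis=1)` aligns rows on the index, so if df does not have the default 0..n-1 index (for example after filtering), the probabilities can end up on the wrong rows or produce NaN. Below is a minimal sketch of the same idea that reuses df's index. The small df and the two arrays are made-up stand-ins for your data and for the output of `my_forest.predict` and `my_forest.predict_proba`, and the column names `prob_0` and `prob_1` are my own choice (the column order of `predict_proba` follows `my_forest.classes_`).

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's df, deliberately given a non-default index
df = pd.DataFrame(
    {"ID_CONTACT": [101, 102, 103], "var1": [0.2, 0.7, 0.5]},
    index=[10, 20, 30],
)

# Stand-ins for my_forest.predict(features) and my_forest.predict_proba(features)
results = np.array([0, 1, 1])
predicted_probs = np.array([[0.9, 0.1], [0.3, 0.7], [0.4, 0.6]])

# Build the probability frame on df's own index so concat lines up row by row
probs_df = pd.DataFrame(
    predicted_probs, columns=["prob_0", "prob_1"], index=df.index
)

# One frame holding the original columns, the 0/1 prediction, and both probabilities
out = pd.concat([df.assign(results=results), probs_df], axis=1)

# With no path, to_csv returns the CSV text; pass a file path to write to disk instead
csv_text = out.to_csv(index=False)
```

If df already has the default index, the one-liner above gives the same result; passing `index=df.index` just makes it safe in both cases.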
Upvotes: 1