JonTargaryen
JonTargaryen

Reputation: 1367

How to get accuracy in RandomForest Model in Python?

I got this script, that predict with RandomForest and LinearRegression the values for the seconds dataset.That works ok, the accuracy for the linear regression is 18% , too bad.

So Im trying with RandomForest, but I dont know how to calculate the accuracy of that model..

import pandas as pd

from sklearn.ensemble import RandomForestRegressor
from sklearn.datasets import make_regression

import numpy as np
import pandas as pd
import scipy
import matplotlib.pyplot as plt
from pylab import rcParams
import urllib
import sklearn
from sklearn.linear_model import RidgeCV, LinearRegression, Lasso

from sklearn.preprocessing import StandardScaler, PolynomialFeatures
from sklearn.model_selection import GridSearchCV

data = pd.read_csv('EncuestaVieja.csv')
X = data[['Edad','Sexo','v1','v2','v3']]
y = data['Alumna']

dataP = pd.read_csv('EncuestaVieja_test.csv')
X_p = dataP[['Edad','Sexo','v1','v2','v3']]
y_p = dataP['Alumna']

dataT = pd.read_csv('EncuestaVieja_test_2.csv')
X_t = dataT[['Edad','Sexo','v1','v2','v3']]
y_t = dataT['Alumna']
regr = linear_model.LinearRegression()

regr.fit(X, y)

lr = RandomForestRegressor(n_estimators=50)
lr.fit(X, y)

X_test = pd.read_csv('EncuestaNueva.csv')[['Edad','Sexo','v1','v2','v3']]

predictions = regr.predict(X_test)


predictions2 = lr.predict(X_test)
print( 'RandomForest Accuracy: ')
print(((predictions2)))
print( '')
from sklearn.linear_model import LinearRegression
regressor = LinearRegression()
regressor.fit(X_p,y_p)
accuracy = regressor.score(X_t,y_t)
print( 'Linear Regression Accuracy: ', accuracy*100,'%')
print(((predictions)))

OUTPUT:

RandomForest Accuracy: 
[ 1.64  2.54  2.6   2.38  1.64  1.32  1.68  2.56  3.    2.28  2.38  2.68
  2.9   2.5   2.78  1.96  1.56  2.6   2.12  2.76  2.74  1.66  1.68  2.12
  2.3   2.36  2.28  2.28  2.82  1.7   1.86  2.36  1.24]

Linear Regression Accuracy:  18.1336149086 %
[ 1.2681851   1.02802219  3.13377072  2.96885127  2.30808853  1.98814349
  2.39233726  2.8638321   1.86640316  2.63073399  2.21166731  2.25201016
  1.95065189  2.65360517  3.08855254  1.01229733  2.18225606  2.41802534
  2.43539473  2.50227407  1.71105799  1.88238089  2.12152321  3.33525397
  2.72820453  2.43241713  2.88757874  2.6242382   2.63087916  1.98379487
  2.25430306  1.96810279  0.8554685 ]

Upvotes: 3

Views: 22776

Answers (1)

Mitchell Leefers
Mitchell Leefers

Reputation: 312

I think this is handled with the score() method

lr.score(x_test, y_test)

This will return the R^2 value for your model. It looks like in your case you only have an x_test though. Note that this is not the accuracy. Regression models do not use accuracy like classification models. Instead different metrics are computed such as, mean square error or coefficient of determination. These metrics can show how accurately predicted values match known values or how closely a regression model fits a regression line.

Upvotes: 2

Related Questions