Victor Johnzon

Reputation: 23

How to predict from an array of data (Python, scikit-learn, pandas)

I found some code that predicts the next value using scikit-learn's linear regression.

I am able to predict a single value, but I actually need to predict six values and print all six predictions.

Here is the code:

def linear_model_main(x_parameters, y_parameters, predict_value):
    # Create linear regression object
    regr = linear_model.LinearRegression()
    regr.fit(x_parameters, y_parameters)
    # noinspection PyArgumentList
    predict_outcome = regr.predict(predict_value)
    score = regr.score(x_parameters, y_parameters)
    predictions = {'intercept': regr.intercept_, 'coefficient': regr.coef_, 'predicted_value': predict_outcome, 'accuracy': score}
    return predictions

predicted_value = 9 #I NEED TO PREDICT 9,10,11,12,13,14,15

result = linear_model_main(X, Y, predicted_value)
print('Constant Value: {0}'.format(result['intercept']))
print('Coefficient: {0}'.format(result['coefficient']))
print('Predicted Value: {0}'.format(result['predicted_value']))
print('Accuracy: {0}'.format(result['accuracy']))

I tried doing like this:

predicted_value = {9,10,11,12,13,14,15}

result = linear_model_main(X, Y, predicted_value)
print('Constant Value: '.format(result['intercept']))
print('Coefficient: '.format(result['coefficient']))
print('Predicted Value: '.format(result['predicted_value']))
print('Accuracy: '.format(result['accuracy']))

The error message is:

Traceback (most recent call last):
  File "C:Python34\data\cp.py", line 28, in <module>
    result = linear_model_main(X, Y, predicted_value)
  File "C:Python34\data\cp.py", line 22, in linear_model_main
    predict_outcome = regr.predict(predict_value)
  File "C:\Python34\lib\site-packages\sklearn\linear_model\base.py", line 200, in predict
    return self._decision_function(X)
  File "C:\Python34\lib\site-packages\sklearn\linear_model\base.py", line 183, in _decision_function
    X = check_array(X, accept_sparse=['csr', 'csc', 'coo'])
  File "C:\Python34\lib\site-packages\sklearn\utils\validation.py", line 393, in check_array
    array = array.astype(np.float64)
TypeError: float() argument must be a string or a number, not 'set'

C:\>

and

predicted_value = 9,10,11,12,13,14,15

result = linear_model_main(X, Y, predicted_value)
print('Constant Value: '.format(result['intercept']))
print('Coefficient: '.format(result['coefficient']))
print('Predicted Value: '.format(result['predicted_value']))
print('Accuracy: '.format(result['accuracy']))

I got these errors:

C:\Python34\lib\site-packages\sklearn\utils\validation.py:386: DeprecationWarning: Passing 1d arrays as data is deprecated in 0.17 and willraise ValueError in 0.19. Reshape your data either using X.reshape(-1, 1) if your data has a single feature or X.reshape(1, -1) if it contains a single sample.
  DeprecationWarning)
Traceback (most recent call last):
  File "C:Python34\data\cp.py", line 28, in <module>
    result = linear_model_main(X, Y, predicted_value)
  File "C:Python34\data\cp.py", line 22, in linear_model_main
    predict_outcome = regr.predict(predict_value)
  File "C:\Python34\lib\site-packages\sklearn\linear_model\base.py", line 200, in predict
    return self._decision_function(X)
  File "C:\Python34\lib\site-packages\sklearn\linear_model\base.py", line 185, in _decision_function
    dense_output=True) + self.intercept_
  File "C:\Python34\lib\site-packages\sklearn\utils\extmath.py", line 184, in safe_sparse_dot
    return fast_dot(a, b)
ValueError: shapes (1,3) and (1,1) not aligned: 3 (dim 1) != 1 (dim 0)

C:\>

And if I change it like this:

predicted_value = 9
result = linear_model_main(X, Y, predicted_value)
print('Constant Value: {1}'.format(result['intercept']))
print('Coefficient: {1}'.format(result['coefficient']))
print('Predicted Value: {}'.format(result['predicted_value']))
print('Accuracy: {1}'.format(result['accuracy']))

I again get an error, this time saying that it crosses a limit. What should I do?

Upvotes: 2

Views: 11508

Answers (2)

Demetri Pananos

Reputation: 7404

Here is a working example. I have not reproduced your function, only shown the proper syntax. It looks like you aren't passing the data into fit properly.

import numpy as np
from sklearn import linear_model

# Synthetic data: y = 2x + 1 plus Gaussian noise
x = np.random.uniform(-2, 2, 101)
y = 2 * x + 1 + np.random.normal(0, 1, len(x))

# Note that x and y must be in a specific shape: scikit-learn expects
# 2-D arrays of shape (n_samples, n_features), so make them columns.
x = x.reshape(-1, 1)
y = y.reshape(-1, 1)

# Note I am passing in x and y in column shape
LM = linear_model.LinearRegression().fit(x, y)

# The values to predict must be reshaped into a column as well
predict_me = np.array([9, 10, 11, 12, 13, 14, 15])
predict_me = predict_me.reshape(-1, 1)

score = LM.score(x, y)  # R^2 on the training data
predicted_values = LM.predict(predict_me)

predictions = {'intercept': LM.intercept_, 'coefficient': LM.coef_,
               'predicted_value': predicted_values, 'accuracy': score}
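
Applied to the function from the question, the same reshaping might look like the sketch below. It assumes a single feature, and the training values X and Y are made up for illustration because the question does not show them; the parameter name predict_values and the list values_to_predict are likewise my own.

```python
import numpy as np
from sklearn import linear_model


def linear_model_main(x_parameters, y_parameters, predict_values):
    """Fit a one-feature linear regression and predict every entry of predict_values."""
    # scikit-learn expects X as a 2-D array of shape (n_samples, n_features);
    # reshape(-1, 1) turns a flat sequence into a single column.
    x = np.asarray(x_parameters, dtype=float).reshape(-1, 1)
    y = np.asarray(y_parameters, dtype=float)
    to_predict = np.asarray(predict_values, dtype=float).reshape(-1, 1)

    regr = linear_model.LinearRegression()
    regr.fit(x, y)
    return {'intercept': regr.intercept_,
            'coefficient': regr.coef_,
            'predicted_value': regr.predict(to_predict),
            'accuracy': regr.score(x, y)}  # R^2 on the training data


# Made-up training data lying exactly on y = 2x + 1 (illustrative assumption)
X = [1, 2, 3, 4, 5, 6, 7, 8]
Y = [3, 5, 7, 9, 11, 13, 15, 17]
values_to_predict = [9, 10, 11, 12, 13, 14, 15]  # a list, not a set or a bare scalar

result = linear_model_main(X, Y, values_to_predict)

# '{0}' is the first argument of format(); with a single argument,
# '{1}' would raise IndexError.
print('Constant Value: {0}'.format(result['intercept']))
print('Coefficient: {0}'.format(result['coefficient']))
print('Accuracy: {0}'.format(result['accuracy']))
for x_new, y_new in zip(values_to_predict, result['predicted_value']):
    print('Predicted Value for {0}: {1}'.format(x_new, y_new))
```

A list or NumPy array keeps the order of the values, so each prediction lines up with its input. A Python set cannot be converted to a numeric array, which is what the TypeError in the question is complaining about.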

Upvotes: 3

PabTorre

Reputation: 3127

regr.predict() expects a 2-D array shaped like X (see the docs). Instead, you are providing a scalar value (9). This is why you are getting the error about the shapes of the objects not matching.

You need to predict with an object that has the same number of "columns" as X (although not necessarily the same number of rows) and you'll get back a prediction for each row.
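
As a rough illustration of that rule, here is a small sketch with a hypothetical two-column training set (the numbers and variable names are invented for the example). Predicting on three new rows with the same two columns returns three values, while an input with a different number of columns is rejected.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical training set: 5 rows, 2 feature columns
X_train = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 5.0], [4.0, 3.0], [5.0, 4.0]])
# Targets built from a known rule: y = 2*a - 1*b + 0.5
y_train = X_train @ np.array([2.0, -1.0]) + 0.5

model = LinearRegression().fit(X_train, y_train)

# 3 new rows with the same 2 columns: one prediction per row
X_new = np.array([[6.0, 1.0], [7.0, 2.0], [8.0, 3.0]])
preds = model.predict(X_new)
print(preds.shape)

# Same number of rows but only 1 column: the shapes no longer line up
mismatch_raised = False
try:
    model.predict(np.array([[6.0], [7.0], [8.0]]))
except ValueError as exc:
    mismatch_raised = True
    print('ValueError:', exc)
```

The number of rows passed to predict is free to differ from the training data; only the column count has to match.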

Your predict_value variable doesn't seem to do anything useful, since Y should contain the labels you are trying to predict for each of the rows of X.

Upvotes: 0
