Kelsey

Reputation: 89

How to inverse transform regression predictions after Lasso and RobustScaler?

I'm trying to figure out how to unscale my data (presumably using inverse_transform) for predictions after using RobustScaler and Lasso. The data below is just an example. My actual data is much larger and more complicated, but I want to use RobustScaler (as my data has outliers) and Lasso (as my data has dozens of useless features).

Basically, if I try to use this model to predict anything, I want that prediction in unscaled terms. When I try to do this with the example data point, I get an error that seems to want me to unscale data that is the same size as the training subset (aka two observations). I get the following error: ValueError: non-broadcastable output operand with shape (1,1) doesn't match the broadcast shape (1,2)

How can I unscale just one prediction? Is this possible?

import pandas as pd
from sklearn.linear_model import Lasso
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import RobustScaler

data = [[100, 1, 50],[500 , 3, 25],[1000 , 10, 100]]
df = pd.DataFrame(data,columns=['Cost','People', 'Supplies'])

X = df[['People', 'Supplies']]
y = df[['Cost']]

#Split
X_train,X_test,y_train,y_test = train_test_split(X,y)

#Scale data
transformer = RobustScaler().fit(X_train)
transformer.transform(X_train)

X_rtrain = RobustScaler().fit_transform(X_train)
y_rtrain = RobustScaler().fit_transform(y_train)
X_rtest = RobustScaler().fit_transform(X_test)
y_rtest = RobustScaler().fit_transform(y_test)

#Fit Train Model
lasso = Lasso()
lasso_alg = lasso.fit(X_rtrain,y_rtrain)

train_score =lasso_alg.score(X_rtrain,y_rtrain)
test_score = lasso_alg.score(X_rtest,y_rtest)

print ("training score:", train_score)
print ("test score:", test_score)

#Predict example 
example = [[10,100]]
transformer.inverse_transform(lasso_alg.predict(example).reshape(-1, 1))

Upvotes: 2

Views: 6039

Answers (1)

lrnzcig

Reputation: 3947

You cannot use the same transformer object for both X and y. In your snippet, the transformer is fitted on X, which has two columns, so you get an error when you inverse-transform the prediction, which has only one column. (Actually you are lucky to get an error; if your X had a single column, you would silently get nonsense.)
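To illustrate the mismatch, here is a minimal sketch. The toy arrays and the stand-in value `0.25` for a scaled prediction are assumptions made up for the demonstration; only `RobustScaler` and its `fit` / `inverse_transform` methods come from scikit-learn.

```python
import numpy as np
from sklearn.preprocessing import RobustScaler

# Hypothetical toy data: two feature columns, one target column.
X = np.array([[1.0, 50.0], [3.0, 25.0], [10.0, 100.0]])
y = np.array([[100.0], [500.0], [1000.0]])

scaler_x = RobustScaler().fit(X)  # learns one center/scale per feature (2 of each)
scaler_y = RobustScaler().fit(y)  # learns a single center/scale for the target

# Stand-in for a scaled prediction with shape (1, 1); the value is arbitrary.
pred_scaled = np.array([[0.25]])

# The X scaler expects two columns, so it rejects a one-column input.
try:
    scaler_x.inverse_transform(pred_scaled)
    mismatch_raised = False
except ValueError:
    mismatch_raised = True

# The y scaler was fitted on one column, so it can undo the scaling.
restored = scaler_y.inverse_transform(pred_scaled)
print(mismatch_raised, restored.shape)
```

The scaler fitted on y maps the value back using the target's own median and interquartile range, which is what the prediction needs.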

Something like this should work:

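# Fit separate scalers for the features and the target, on the training data only
transformer_x = RobustScaler().fit(X_train)
transformer_y = RobustScaler().fit(y_train)
X_rtrain = transformer_x.transform(X_train)
y_rtrain = transformer_y.transform(y_train)
X_rtest = transformer_x.transform(X_test)
y_rtest = transformer_y.transform(y_test)

# Fit the model on the scaled training data
lasso = Lasso()
lasso_alg = lasso.fit(X_rtrain, y_rtrain)

train_score = lasso_alg.score(X_rtrain, y_rtrain)
test_score = lasso_alg.score(X_rtest, y_rtest)

print("training score:", train_score)
print("test score:", test_score)

# Scale the new data point with the X scaler before predicting,
# then map the prediction back to the original units with the y scaler
example = pd.DataFrame([[10, 100]], columns=['People', 'Supplies'])
example_scaled = transformer_x.transform(example)
transformer_y.inverse_transform(lasso.predict(example_scaled).reshape(-1, 1))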
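As an alternative to managing two scalers by hand, scikit-learn's `TransformedTargetRegressor` can wrap the target scaling, and a pipeline can wrap the feature scaling. The following is only a sketch under assumptions: it reuses the example data from the question, skips the train/test split for brevity, and sets `alpha=0.01` arbitrarily so that Lasso does not shrink every coefficient to zero on this tiny dataset.

```python
import pandas as pd
from sklearn.compose import TransformedTargetRegressor
from sklearn.linear_model import Lasso
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import RobustScaler

data = [[100, 1, 50], [500, 3, 25], [1000, 10, 100]]
df = pd.DataFrame(data, columns=['Cost', 'People', 'Supplies'])
X = df[['People', 'Supplies']]
y = df['Cost']  # 1D target; the wrapper reshapes it for the scaler internally

# The pipeline scales X before Lasso sees it; the wrapper scales y before
# fitting and inverse-transforms predictions automatically.
# alpha=0.01 is an arbitrary choice for this illustration.
model = TransformedTargetRegressor(
    regressor=make_pipeline(RobustScaler(), Lasso(alpha=0.01)),
    transformer=RobustScaler(),
)
model.fit(X, y)

# New data is passed in original units; the prediction comes back in Cost units.
example = pd.DataFrame([[10, 100]], columns=['People', 'Supplies'])
prediction = model.predict(example)
print(prediction)
```

With this layout there is no separate `inverse_transform` call to forget, and the same scalers are applied consistently at fit and predict time.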

Upvotes: 2
