How can I predict a value that is no greater than the target?

I have a variable and I need to predict its value as closely as possible, but never above it. For example, given y_true = 9000, I want y_pred to be any value in the range [0, 9000], as close to 9000 as possible. Likewise, if y_true = 8000, y_pred should be in [0, 8000]. In other words, I want to put a restriction on the predicted value, and that threshold is individual for each pair of prediction and target in the sample: if y_true = [8750, 9200, 8900, 7600], then y_pred should be [<=8750, <=9200, <=8900, <=7600]. The only goal is to never predict more than the target while getting as close to it as possible. Predicting zero everywhere would trivially satisfy the constraint, but I need to get as close as possible.

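import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

# `data` and `df_tar` (features and target) are defined elsewhere and not shown here
data, target = np.array(data), np.array(df_tar)
X_train, X_test, y_train, y_test = train_test_split(data, target)
gbr = GradientBoostingRegressor(max_depth=1, n_estimators=100)
%time gbr.fit(X_train, np.ravel(y_train))  # IPython magic: times the fit
print(gbr.score(X_test, y_test), gbr.score(X_train, y_train))  # R^2 on test and train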

Upvotes: 2

Views: 305

Answers (1)

Celius Stingher

Reputation: 18377

Changing the model itself so that sklearn enforces this constraint during training would be complicated, so I strongly suggest applying the filter after the prediction instead: replace every predicted value that exceeds its target with the target itself (for example, anything over 9000 becomes 9000 when y_true = 9000). Afterwards, compute the score manually; I use MSE in this scenario.

Here is a full working example of my approach:

from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error as mse
import numpy as np

X = [[8500,9500],[9200,8700],[8500,8250],[5850,8800]]
y = [8750,9200,8900,7600]
data, target = np.array(X),np.array(y)
gbr = GradientBoostingRegressor(max_depth=1,n_estimators=100)
gbr.fit(data,np.ravel(target))
predictions = gbr.predict(data)
print(predictions)  # the original predictions

Output:

[8750.14958301 9199.23464805 8899.87846735 7600.73730159]

Perform the replacement:

# Element-wise minimum: keep the prediction unless it exceeds its target
fixed_predictions = np.minimum(predictions, target)
print(fixed_predictions)

Output:

[8750.         9199.23464805 8899.87846735 7600.        ]
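To confirm that the replacement did what was intended, a small check can count how many predictions still exceed their target and how far the rest fall short on average. This is only a minimal sketch: the helper name `overshoot_report` and its returned keys are made up for illustration, and the arrays repeat the values printed above.

```python
import numpy as np

def overshoot_report(y_true, y_pred):
    """Summarise how well predictions respect the 'never above the target' rule.

    Hypothetical helper for illustration; not part of scikit-learn.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    n_over = int(np.sum(y_pred > y_true))              # predictions above their target
    shortfall = np.clip(y_true - y_pred, 0.0, None)    # distance below target, 0 if above
    return {"n_over": n_over, "mean_shortfall": float(shortfall.mean())}

# Values copied from the example above
target = np.array([8750, 9200, 8900, 7600])
predictions = np.array([8750.14958301, 9199.23464805, 8899.87846735, 7600.73730159])
clipped = np.minimum(predictions, target)   # same effect as the replacement step

raw_report = overshoot_report(target, predictions)
clipped_report = overshoot_report(target, clipped)
print(raw_report)
print(clipped_report)
```

After clipping, `n_over` should be zero, while `mean_shortfall` shows how much room is left below the targets.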

Compute the new score:

score = mse(target,fixed_predictions)  # score the clipped predictions, not the raw ones
print(score)

Output (approximately, for the arrays shown above):

0.1501
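The clipping above needs the true target, so it helps when scoring labelled data but cannot be applied to new samples. One possible way to bias the model itself toward low predictions, which is not part of the approach above, is scikit-learn's built-in quantile loss (`loss="quantile"` with a small `alpha`). The sketch below assumes synthetic data and an arbitrarily chosen `alpha=0.05`; it makes overshoots less frequent but does not guarantee that none occur.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Synthetic data, purely an assumption for this illustration
rng = np.random.default_rng(0)
X = rng.uniform(5000, 10000, size=(200, 2))
y = 0.5 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(0, 100, size=200)

# Fit the 5th percentile of the target instead of its mean,
# so most predictions should land at or below the true value
low_model = GradientBoostingRegressor(
    loss="quantile", alpha=0.05, max_depth=1, n_estimators=100, random_state=0
)
low_model.fit(X, y)
low_pred = low_model.predict(X)

# Share of training samples where the prediction does not exceed the target
share_not_above = float(np.mean(low_pred <= y))
print(share_not_above)
```

A smaller `alpha` pushes predictions lower and reduces overshoots at the cost of being further from the target, so the value is a trade-off to tune on held-out data.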

Upvotes: 2
