fnaos

Reputation: 151

How to perform a constrained optimization over a scaled regression model?

Suppose I am applying a Gaussian process regression to my data. Before fitting the model, I perform some feature engineering. After the model is fit, my goal is to run a minimization over the fitted curve, constrained on some of the features, in order to find the optimal X. Here is the question: since I apply feature engineering to my data and fit the model to that transformed data, how do I express the constraint values for the optimization, given that my input data has been altered? If that sounds confusing, the following explanation with some code might help:

Suppose I have the data:

# X (theta, alpha1, alpha2)
array([[ 9.07660169,  0.61485493,  1.70396493],
       [ 9.51498486, -5.49212002, -0.68659511],
       [10.45737558, -2.2739529 , -2.03918961],
       [10.46857663, -0.4587848 ,  0.54434441],
       [ 9.10133699,  8.38066374,  0.66538822],
       [ 9.17279647,  0.36327109, -0.30558115],
       [10.36532505,  0.87099676, -7.73775872],
       [10.13681026, -1.64084098, -0.09169159],
       [10.38549264,  1.80633583,  1.3453195 ],
       [ 9.72533357,  0.55861224,  0.74180309]])

# y
array([4.93483686, 5.66226844, 7.51133372, 7.54435854, 4.92758927,
       5.0955348 , 7.26606153, 6.86027353, 7.36488184, 6.06864003])

Then I apply some feature engineering, in this case a simple MinMaxScaler:

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
scaler = MinMaxScaler()
scaler.fit(X_train)
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)
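
For reference, MinMaxScaler rescales each feature independently to the range seen in the training data; here is a minimal sketch of the mapping (the point below is just a made-up example):

# Hypothetical point in the original units (theta, alpha1, alpha2),
# used only to illustrate what the fitted scaler does to a single sample.
point = np.array([[9.0, 0.5, 0.7]])
point_scaled = scaler.transform(point)                 # into the scaled space
point_back = scaler.inverse_transform(point_scaled)    # back to the original units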

Then, I fit a model to my data:

from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel as C

kernel = C(1.0, (1e-4, 1e4)) * RBF(10, (1e-3, 1e3))
model = GaussianProcessRegressor(kernel=kernel, n_restarts_optimizer=5, optimizer='fmin_l_bfgs_b')
model.fit(X_train, y_train)

Now I perform the constrained minimization of the fitted model. Note that I constrain theta to a constant value equal to nine. This is the motivation of this post: I am setting the constraint on theta to a value taken from the samples as they were before the feature engineering step.

from scipy.optimize import minimize

# theta_bin, data_alpha1_bin and data_alpha2_bin come from the original (unscaled) data
bnds = np.array([(theta_bin, theta_bin),
                 (data_alpha1_bin.min(), data_alpha1_bin.max()),
                 (data_alpha2_bin.min(), data_alpha2_bin.max())])
x0 = [theta_bin, 0, 0]
residual_plant = minimize(lambda x: -model.predict(np.array([x])), x0, method='SLSQP', bounds=bnds)

To sum up, I need to minimize my fitted machine-learning model, but I also need to feature-scale the data before fitting, as that is required for the Gaussian process. The problem is that the minimization is constrained to a given constant value for one of the features (theta). How do I deal with the fact that the curve is fitted on scaled features, while the constraint is set on values prior to the scaling?

Upvotes: 6

Views: 383

Answers (1)

igrinis

Reputation: 13666

Once you have your scaler fitted, just keep using it. As your transformation is only a per-feature scaling without rotation, the transformed theta coordinate will remain constant.

# x stays in the original units; the objective scales each candidate point
# before predicting, so x0 and bnds can be kept in the original coordinates
residual_plant = minimize(lambda x: -model.predict(scaler.transform(np.array([x])))[0],
                          x0, method='SLSQP', bounds=bnds)
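
Equivalently, you can do the whole optimization in the scaled space: transform the bounds and the initial guess with the fitted scaler, optimize there, and map the optimum back with inverse_transform. A minimal sketch, reusing theta_bin, data_alpha1_bin and data_alpha2_bin from your question and assuming the model was fit on the scaled features:

# Because MinMaxScaler acts on each feature independently, the constant theta
# bound stays a single constant after scaling.
lo = scaler.transform([[theta_bin, data_alpha1_bin.min(), data_alpha2_bin.min()]])[0]
hi = scaler.transform([[theta_bin, data_alpha1_bin.max(), data_alpha2_bin.max()]])[0]
bnds_scaled = list(zip(lo, hi))
x0_scaled = scaler.transform([x0])[0]

res = minimize(lambda x: -model.predict(np.array([x]))[0],
               x0_scaled, method='SLSQP', bounds=bnds_scaled)
x_opt = scaler.inverse_transform([res.x])[0]   # optimum back in the original units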

BTW you intended to write:

model.fit(X_train_scaled,y_train)

right? Otherwise you train on the original coordinates without scaling, which also seems legitimate in this case; I don't see a real need for scaling here. But I believe you need to add normalize_y=True to the GPR, as it assumes a zero mean for the observed target values, and that is not the case according to the data sample you've provided.
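
For reference, a minimal sketch of the regressor with normalize_y=True, keeping your kernel and fitting on the scaled features:

# Same kernel as in the question, but let the GPR normalize the targets
model = GaussianProcessRegressor(kernel=kernel,
                                 n_restarts_optimizer=5,
                                 optimizer='fmin_l_bfgs_b',
                                 normalize_y=True)
model.fit(X_train_scaled, y_train)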

Upvotes: 2
