rebrid

Reputation: 440

Python: optimizing the prediction of a random forest regressor

I have built a random forest regressor to predict the elasticity of a certain object based on color, material, size and other features.

The model works fine and I can predict the expected elasticity given certain inputs.

Eventually, I want to find the lowest elasticity under certain constraints. The inputs have limited possibilities, e.g., material can only be plastic or textile.

I would like a smart solution in which I don't have to brute-force all the possible combinations to find the one with the lowest elasticity. I have found that surrogate models can be used for this, but I don't understand how to apply the concept to my problem. For example, what is the objective function I should optimize in my case? I thought of passing the .predict() of the random forest, but I'm not sure this is the correct way.

To summarize, I'd like a solution that, given certain conditions, tells me the best set of features for the lowest elasticity. For example, if I'm looking for the lowest elasticity when the object is made of plastic, I'd like to receive the set of other features that yields the lowest elasticity in that case. Or simply, which features I should tune to improve the performance.

import random

import numpy as np
from scipy.optimize import minimize
from sklearn.ensemble import RandomForestRegressor

model = RandomForestRegressor(n_estimators=10, random_state=0)
model.fit(X_train, y_train)  # X_train and y_train are prepared elsewhere

# allowed values for each feature
material = [0, 1]
size = list(range(1, 45))
color = list(range(1, 500))


def objective(x):
    material = x[0]
    size = x[1]
    color = x[2]
    # minimize() expects a scalar, so take the single prediction out of the array
    return model.predict([[material, size, color]])[0]

# initial guesses
n = 3
x0 = np.zeros(n)
x0[0] = random.choice(material)
x0[1] = random.choice(size)
x0[2] = random.choice(color)

# optimize (unconstrained; Nelder-Mead treats the discrete features as continuous)
solution = minimize(objective, x0, method='nelder-mead',
                    options={'xatol': 1e-8, 'disp': True})

x = solution.x

print('Final Objective: ' + str(objective(x)))

Upvotes: 1

Views: 683

Answers (1)

Damir Devetak

Reputation: 762

This is one solution, if I understood you correctly:

import numpy as np
from scipy.optimize import differential_evolution
from sklearn.ensemble import RandomForestRegressor

# define input data: one column per feature
material = np.random.choice([0, 1], size=(10, 1))
size     = np.arange(10).reshape(-1, 1)
color    = np.arange(20, 30).reshape(-1, 1)

X = np.concatenate((material, size, color), axis=1)  # shape = (10, 3)

# define output = elasticity between [0, 1], i.e. 0-100%
elasticity = np.array([0.51135295, 0.54830051, 0.42198349, 0.72614775, 0.55087905,
                       0.99819945, 0.3175208 , 0.78232872, 0.11621277, 0.32219236])

# fit the surrogate model
model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X, elasticity)


def objective(x):
    material, size, color = x
    # differential_evolution expects a scalar, so take the single prediction
    return model.predict([[material, size, color]])[0]


# minimize the model's prediction within the feature bounds
limits = ((0, 1), (0, 10), (20, 30))
res = differential_evolution(objective, limits, maxiter=10000, seed=11111)

min_y = model.predict([res.x])[0]
print("min_elasticity ==", round(min_y, 5))

The output is the minimal elasticity found within the given limits:

min_elasticity == 0.19029
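res.x also holds the feature combination that produced this minimum, which is the set you actually asked for. Since the real features are discrete, a minimal sketch is to round the continuous solution back onto the grid and re-check the prediction (rounding may land slightly off the true discrete optimum):

best = np.round(res.x).astype(int)  # snap [material, size, color] to the grid
print("best features ==", best)
print("elasticity at best ==", round(model.predict([best])[0], 5))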

These are random data, so the RandomForestRegressor perhaps doesn't do the best job.
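For the constrained case from the question (lowest elasticity when the object is plastic), a minimal sketch is to fix that feature inside the objective and optimize only over the rest. This assumes plastic is encoded as 0; adjust to your actual encoding:

def objective_plastic(x):
    size, color = x
    # material is hardcoded to the assumed "plastic" code (0)
    return model.predict([[0, size, color]])[0]

limits_plastic = ((0, 10), (20, 30))  # bounds for size and color only
res_p = differential_evolution(objective_plastic, limits_plastic,
                               maxiter=10000, seed=11111)
print("best size/color for plastic ==", res_p.x)
print("min elasticity for plastic ==", round(model.predict([[0, *res_p.x]])[0], 5))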

Upvotes: 1
