Implementing a machine learning algorithm in python when output is generated one-at-a-time

Question

I have a large, black-box model, which I am trying to calibrate, and I am trying to implement a basic machine-learning algorithm to assist the calibration, but I am getting stuck.

For this case, the model takes one input, called scalar, and generates an output which is a list of float-point values called y_pred.
I am trying to adjust scalar so that the output, y_pred is as close as possible to a known set of values called y_true.
I am measuring the difference between y_pred and y_true via the mean squared error.
Therefore, my input is scalar, and I want to minimise the mean squared error

The method I have been using to solve this, is as follows:

from sklearn.metrics import mean_squared_error

y_true = [1, 2, 3, 4]
list_of_scalars = []
list_of_results = []

for i in range(0, 2, 0.01):
    scalar = i
    list_of_scalars.append(scalar)
    y_pred = BlackBox.run(scalar)
    mse = mean_squared_error(y_true, y_pred)
    list_of_results.append(mse)

best_value = min(mse)
best_value_index = list_of_mse.index(best_value)
the_best_input = list_of_scalars[best_value_index]

This seems like a bad method, because it always takes the same amount of time, and assumes in advance that I know the best range that scalar will occupy. I could fine tune this method by trying to fit a line and retrieving the minimum value, but I'd still have these problems.

It seems that some kind of machine learning algorithm would be a better approach here. However, I'm not sure what type of algorithm would suit this problem? My intuition says a gradient descent, but I've not seen one implemented in this manner. The examples I've seen have a dataset before running the descent, rather than the data being generated on the fly.

My best guess is that such an algorithm would need to be aware of the gradient between the current mean_squared_error, and the previous mean_squared_error, and then adjust how much the scalar would increase or decrease in response to this.

My best guess at mapping this out, is as follows:

from sklearn.metrics import mean_squared_error

y_true = [1, 2, 3, 4]
scalar = 0.01  # Some arbitrarily small scalar value
mse = 9999999  # Some arbitrarily large mse
gradient = 2  # Some arbitrarily large gradient
threshold = 0.001  # The threshold under which the while loop will end

def some_algorithm(gradient, scalar) -> float:
    '''
    Takes the current gradient, and the current scalar, and determines how much to 
    adjust the scalar by
    '''
    ...
    return adjustment_factor

while gradient > threshold:
    y_pred = BlackBox(scalar)
    current_mse = mean_squared_error(y_true, y_pred)
    gradient = current_mse / mse
    adjustment_factor = some_algorithm(gradient, scalar)
    scalar *= adjustment_factor

I'm happy to use an out-of-the-box solution such as sklearn classes, but it is the implementation that I'm getting stuck on.

Implementing a machine learning algorithm in python when output is generated one-at-a-time

Answers (1)

Related Questions