kh2600

Reputation: 3

Reconstructing sklearn MLP regression in MATLAB

I am using sklearn to train a multilayer perceptron regressor (MLPRegressor) on 12 features and one output. A StandardScaler() is fit to the training data and applied to all input data. After a training period with architectural optimization, I have a model that is seemingly quite accurate (<10% error). I now need to extract the weights and biases in order to implement the prediction in real time on a system that interacts with a person. I am doing this with my_model.coefs_ for the weights and my_model.intercepts_ for the biases. The weights are appropriately shaped for the number of nodes in my model, and the biases have the appropriate lengths for each layer.

The problem is that when I implement the matrix algebra in MATLAB, I get wildly different predictions from what my_model.predict() yields.

My reconstruction process for an MLP with two hidden layers (11 nodes in the first, 10 in the second):

scale()             % elementwise: subtract feature mean, divide by feature stdev
scaled_obs = scale(raw_obs)
% Up to this point results from MATLAB == sklearn

w1 = [12x11]        % weights from the input layer to the first hidden layer
w2 = [11x10]
w3 = [10x1]
b1 = [11x1]         % bias added to the first hidden layer after w1 has been applied
b2 = [10x1]
b3 = [1x1]

my_prediction = ((( scaled_obs * w1 + b1') * w2  + b2') * w3  + b3);

I also tried

my_prediction2 = ((( scaled_obs * w1 .* b1') * w2  .* b2') * w3  .* b3);   % because nothing worked...

For my specific data:

Sklearn prediction = 1.731
my_prediction = -50.347
my_prediction2 = -3.2075

Is there another weight/bias that I am skipping when extracting relevant params from my_model? Is my order of operations in the reconstruction flawed?

Upvotes: 0

Views: 134

Answers (1)

Yash Patel

Reputation: 88

In my opinion my_prediction = ((( scaled_obs * w1 + b1') * w2 + b2') * w3 + b3); is correct, but there is one missing piece: the activation function. What activation function did you pass to the model? By default, MLPRegressor uses relu as the activation for every hidden layer. The output layer has a separate activation, which for regression is the identity function, basically f(x) = x, so you don't have to do anything extra for that.

If you selected relu, or if you didn't select an activation at all (relu is the default), then you have to apply it yourself after each hidden layer, which in numpy looks like np.maximum(0, your_layer1_calculation). I'm not certain about MATLAB, but a guess at the equivalent is below.
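A minimal sketch, assuming your_layer1_calculation is the row vector computed at that layer (untested in MATLAB on my end):

relu_layer1 = max(0, your_layer1_calculation);   % elementwise max against scalar 0 == ReLU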

So the final formula would be:

layer1 = np.dot(scaled_inputs, weight0) + bias0
layer2 = np.dot(np.maximum(0, layer1), weight1) + bias1
...
layer(n-1) = np.dot(np.maximum(0, layer(n-2)), weight(n-1)) + bias(n-1)
layer(n) = layer(n-1)   # identity function on the output layer
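Putting it together for your 12-11-10-1 network in MATLAB, a sketch (assuming scaled_obs is a 1x12 row vector and w1/w2/w3, b1/b2/b3 are sized as in your post; untested on my end):

h1 = max(0, scaled_obs * w1 + b1');   % 1x11 row vector, ReLU after first hidden layer
h2 = max(0, h1 * w2 + b2');           % 1x10 row vector, ReLU after second hidden layer
my_prediction = h2 * w3 + b3;         % 1x1 scalar, identity output layer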

Upvotes: 0
