Vito Gentile
Vito Gentile

Reputation: 14386

Export a neural network trained with MATLAB in other programming languages

I trained a neural network using the MATLAB Neural Network Toolbox, and in particular using the command nprtool, which provides a simple GUI to use the toolbox features, and to export a net object containing the informations about the NN generated.

In this way, I created a working neural network, that I can use as classifier, and a diagram representing it is the following:

Diagram representing the Neural Network

There are 200 inputs, 20 neurons in the first hidden layer, and 2 neurons in the last layer that provide a bidimensional output.

What I want to do is to use the network in some other programming language (C#, Java, ...).

In order to solve this problem, I try to use the following code in MATLAB:

y1 = tansig(net.IW{1} * input + net.b{1});
Results = tansig(net.LW{2} * y1 + net.b{2});

Assuming that input is a monodimensional array of 200 elements, the previous code would work if net.IW{1} is a 20x200 matrix (20 neurons, 200 weights).

The problem is that I noticed that size(net.IW{1}) returns unexpected values:

>> size(net.IW{1})

    ans =

    20   199

I got the same problem with a network with 10000 input. In this case, the result wasn't 20x10000, but something like 20x9384 (I don't remember the exact value).

So, the question is: how can I obtain the weights of each neuron? And after that, can someone explain me how can I use them to produce the same output of MATLAB?

Upvotes: 15

Views: 23457

Answers (5)

Gabrer
Gabrer

Reputation: 475

This is a small improvement to the great Vito Gentile's answer.

If you want to use the preprocessing and postprocessing 'mapminmax' functions, you have to pay attention because 'mapminmax' in Matlab normalizes by ROW and not by column!

This is what you need to add to the upper "classify" function, to keep a coherent pre/post-processing:

[m n] = size(input);
ymax = 1;
ymin = -1;
for i=1:m
   xmax = max(input(i,:));
   xmin = min(input(i,:));
   for j=1:n
     input(i,j) = (ymax-ymin)*(input(i,j)-xmin)/(xmax-xmin) + ymin;
   end
end

And this at the end of the function:

ymax = 1;
ymin = 0;
xmax = 1;
xmin = -1;
Results = (ymax-ymin)*(Results-xmin)/(xmax-xmin) + ymin;

This is Matlab code, but it can be easily read as pseudocode. Hope this will be helpful!

Upvotes: 3

fermat4214
fermat4214

Reputation: 101

Hence the solution becomes (after correcting all parts)

Here I am giving a solution in Matlab, but if you have tanh() function, you may easily convert it to any programming language. It is for just showing the fields from network object and the operations you need.

  • Assume you have a trained ann (network object) that you want to export
  • Assume that the name of the trained ann is trained_ann

Here is the script for exporting and testing. Testing script compares original network result with my_ann_evaluation() result

% Export IT
exported_ann_structure = my_ann_exporter(trained_ann);

% Run and Compare 
% Works only for single INPUT vector
% Please extend it to MATRIX version by yourself
input = [12 3 5 100];
res1 = trained_ann(input')';
res2 = my_ann_evaluation(exported_ann_structure, input')';

where you need the following two functions

First my_ann_exporter:

function [ my_ann_structure ] = my_ann_exporter(trained_netw)
% Just for extracting as Structure object
my_ann_structure.input_ymax = trained_netw.inputs{1}.processSettings{1}.ymax;
my_ann_structure.input_ymin = trained_netw.inputs{1}.processSettings{1}.ymin;
my_ann_structure.input_xmax = trained_netw.inputs{1}.processSettings{1}.xmax;
my_ann_structure.input_xmin = trained_netw.inputs{1}.processSettings{1}.xmin;

my_ann_structure.IW = trained_netw.IW{1};
my_ann_structure.b1 = trained_netw.b{1};
my_ann_structure.LW = trained_netw.LW{2};
my_ann_structure.b2 = trained_netw.b{2};

my_ann_structure.output_ymax = trained_netw.outputs{2}.processSettings{1}.ymax;
my_ann_structure.output_ymin = trained_netw.outputs{2}.processSettings{1}.ymin;
my_ann_structure.output_xmax = trained_netw.outputs{2}.processSettings{1}.xmax;
my_ann_structure.output_xmin = trained_netw.outputs{2}.processSettings{1}.xmin;
end

Second my_ann_evaluation:

function [ res ] = my_ann_evaluation(my_ann_structure, input)
% Works with only single INPUT vector
% Matrix version can be implemented

ymax = my_ann_structure.input_ymax;
ymin = my_ann_structure.input_ymin;
xmax = my_ann_structure.input_xmax;
xmin = my_ann_structure.input_xmin;
input_preprocessed = (ymax-ymin) * (input-xmin) ./ (xmax-xmin) + ymin;

% Pass it through the ANN matrix multiplication
y1 = tanh(my_ann_structure.IW * input_preprocessed + my_ann_structure.b1);

y2 = my_ann_structure.LW * y1 + my_ann_structure.b2;

ymax = my_ann_structure.output_ymax;
ymin = my_ann_structure.output_ymin;
xmax = my_ann_structure.output_xmax;
xmin = my_ann_structure.output_xmin;
res = (y2-ymin) .* (xmax-xmin) /(ymax-ymin) + xmin;
end

Upvotes: 0

beniroquai
beniroquai

Reputation: 166

I tried to implement a simply 2-layer NN in C++ using OpenCV and then exported the weights to Android which worked quiet well. I wrote a small script which generates a header file with the learned weights and this is used in the following code snipped.

// Map Minimum and Maximum Input Processing Function
Mat mapminmax_apply(Mat x, Mat settings_gain, Mat settings_xoffset, double settings_ymin){

    Mat y;

    subtract(x, settings_xoffset, y);
    multiply(y, settings_gain, y);
    add(y, settings_ymin, y);

    return y;


    /* MATLAB CODE
     y = x - settings_xoffset;
     y = y .* settings_gain;
     y = y + settings_ymin;
     */
}




// Sigmoid Symmetric Transfer Function
Mat transig_apply(Mat n){
    Mat tempexp;
    exp(-2*n, tempexp);
    Mat transig_apply_result = 2 /(1 + tempexp) - 1;
    return transig_apply_result;
}


// Map Minimum and Maximum Output Reverse-Processing Function
Mat mapminmax_reverse(Mat y, Mat settings_gain, Mat settings_xoffset, double settings_ymin){

    Mat x;

    subtract(y, settings_ymin, x);
    divide(x, settings_gain, x);
    add(x, settings_xoffset, x);

    return x;


/* MATLAB CODE
function x = mapminmax_reverse(y,settings_gain,settings_xoffset,settings_ymin)
x = y - settings_ymin;
x = x ./ settings_gain;
x = x + settings_xoffset;
end
*/

}


Mat getNNParameter (Mat x1)
{

    // convert double array to MAT

    // input 1
    Mat x1_step1_xoffsetM = Mat(1, 48, CV_64FC1, x1_step1_xoffset).t();
    Mat x1_step1_gainM = Mat(1, 48, CV_64FC1, x1_step1_gain).t();
    double x1_step1_ymin = -1;

    // Layer 1
    Mat b1M = Mat(1, 25, CV_64FC1, b1).t();
    Mat IW1_1M = Mat(48, 25, CV_64FC1, IW1_1).t();

    // Layer 2
    Mat b2M = Mat(1, 48, CV_64FC1, b2).t();
    Mat LW2_1M = Mat(25, 48, CV_64FC1, LW2_1).t();

    // input 1
    Mat y1_step1_gainM = Mat(1, 48, CV_64FC1, y1_step1_gain).t();
    Mat y1_step1_xoffsetM = Mat(1, 48, CV_64FC1, y1_step1_xoffset).t();
    double y1_step1_ymin = -1;



    // ===== SIMULATION ========


    // Input 1
    Mat xp1 = mapminmax_apply(x1, x1_step1_gainM, x1_step1_xoffsetM, x1_step1_ymin);

    Mat  temp = b1M + IW1_1M*xp1;

    // Layer 1
    Mat a1M = transig_apply(temp);

    // Layer 2
    Mat a2M = b2M + LW2_1M*a1M;

    // Output 1
    Mat y1M = mapminmax_reverse(a2M, y1_step1_gainM, y1_step1_xoffsetM, y1_step1_ymin);

    return y1M;
}

example for a bias in the header could be this:

static double b2[1][48] = {
        {-0.19879, 0.78254, -0.87674, -0.5827, -0.017464, 0.13143, -0.74361, 0.4645, 0.25262, 0.54249, -0.22292, -0.35605, -0.42747, 0.044744, -0.14827, -0.27354, 0.77793, -0.4511, 0.059346, 0.29589, -0.65137, -0.51788, 0.38366, -0.030243, -0.57632, 0.76785, -0.36374, 0.19446, 0.10383, -0.57989, -0.82931, 0.15301, -0.89212, -0.17296, -0.16356, 0.18946, -1.0032, 0.48846, -0.78148, 0.66608, 0.14946, 0.1972, -0.93501, 0.42523, -0.37773, -0.068266, -0.27003, 0.1196}};

Now, that Google published Tensorflow, this became obsolete.

Upvotes: 0

Vito Gentile
Vito Gentile

Reputation: 14386

I solved the problems described above, and I think it is useful to share what I've learned.

Premises

First of all, we need some definitions. Let's consider the following image, taken from [1]:

A scheme of Neural Network

In the above figure, IW stands for initial weights: they represent the weights of neurons on the Layer 1, each of which is connected with each input, as the following image shows [1]:

All neurons are connected with all inputs

All the other weights, are called layer weights (LW in the first figure), that are also connected with each output of the previous layer. In our case of study, we use a network with only two layers, so we will use only one LW array to solve our problems.

Solution of the problem

After the above introduction, we can proceed by dividing the issue in two steps:

  • Force the number of initial weights to match with the input array length
  • Use the weights to implement and use the neural network just trained in other programming languages

A - Force the number of initial weights to match with the input array length

Using the nprtool, we can train our network, and at the end of the process, we can also export in the workspace some information about the entire training process. In particular, we need to export:

  • a MATLAB network object that represents the neural network created
  • the input array used to train the network
  • the target array used to train the network

Also, we need to generate a M-file that contains the code used by MATLAB to create the neural network, because we need to modify it and change some training options.

The following image shows how to perform these operations:

The nprtool GUI to export data and generate the M-code

The M-code generated will be similar to the following one:

function net = create_pr_net(inputs,targets)
%CREATE_PR_NET Creates and trains a pattern recognition neural network.
%
%  NET = CREATE_PR_NET(INPUTS,TARGETS) takes these arguments:
%    INPUTS - RxQ matrix of Q R-element input samples
%    TARGETS - SxQ matrix of Q S-element associated target samples, where
%      each column contains a single 1, with all other elements set to 0.
%  and returns these results:
%    NET - The trained neural network
%
%  For example, to solve the Iris dataset problem with this function:
%
%    load iris_dataset
%    net = create_pr_net(irisInputs,irisTargets);
%    irisOutputs = sim(net,irisInputs);
%
%  To reproduce the results you obtained in NPRTOOL:
%
%    net = create_pr_net(trainingSetInput,trainingSetOutput);

% Create Network
numHiddenNeurons = 20;  % Adjust as desired
net = newpr(inputs,targets,numHiddenNeurons);
net.divideParam.trainRatio = 75/100;  % Adjust as desired
net.divideParam.valRatio = 15/100;  % Adjust as desired
net.divideParam.testRatio = 10/100;  % Adjust as desired

% Train and Apply Network
[net,tr] = train(net,inputs,targets);
outputs = sim(net,inputs);

% Plot
plotperf(tr)
plotconfusion(targets,outputs)

Before start the training process, we need to remove all preprocessing and postprocessing functions that MATLAB executes on inputs and outputs. This can be done adding the following lines just before the % Train and Apply Network lines:

net.inputs{1}.processFcns = {};
net.outputs{2}.processFcns = {};

After these changes to the create_pr_net() function, simply we can use it to create our final neural network:

net = create_pr_net(input, target);

where input and target are the values we exported through nprtool.

In this way, we are sure that the number of weights is equal to the length of input array. Also, this process is useful in order to simplify the porting to other programming languages.

B - Implement and use the neural network just trained in other programming languages

With these changes, we can define a function like this:

function [ Results ] = classify( net, input )
    y1 = tansig(net.IW{1} * input + net.b{1});

    Results = tansig(net.LW{2} * y1 + net.b{2});
end

In this code, we use the IW and LW arrays mentioned above, but also the biases b, used in the network schema by the nprtool. In this context, we don't care about the role of biases; simply, we need to use them because nprtool does it.

Now, we can use the classify() function defined above, or the sim() function equally, obtaining the same results, as shown in the following example:

>> sim(net, input(:, 1))

ans =

    0.9759
   -0.1867
   -0.1891

>> classify(net, input(:, 1))

ans =

   0.9759   
  -0.1867
  -0.1891

Obviously, the classify() function can be interpreted as a pseudocode, and then implemented in every programming languages in which is possibile to define the MATLAB tansig() function [2] and the basic operations between arrays.

References

[1] Howard Demuth, Mark Beale, Martin Hagan: Neural Network Toolbox 6 - User Guide, MATLAB

[2] Mathworks, tansig - Hyperbolic tangent sigmoid transfer function, MATLAB Documentation center

Additional notes

Take a look to the robott's answer and the Sangeun Chi's answer for more details.

Upvotes: 14

Sangeun Chi
Sangeun Chi

Reputation: 31

Thanks to VitoShadow and robott answers, I can export Matlab neural network values to other applications.

I really appreciate them, but I found some trivial errors in their codes and want to correct them.

1) In the VitoShadow codes,

Results = tansig(net.LW{2} * y1 + net.b{2});
-> Results = net.LW{2} * y1 + net.b{2};

2) In the robott preprocessing codes, It would be easier extracting xmax and xmin from the net variable than calculating them.

xmax = net.inputs{1}.processSettings{1}.xmax
xmin = net.inputs{1}.processSettings{1}.xmin

3) In the robott postprocessing codes,

xmax = net.outputs{2}.processSettings{1}.xmax
xmin = net.outputs{2}.processSettings{1}.xmin

Results = (ymax-ymin)*(Results-xmin)/(xmax-xmin) + ymin;
-> Results = (Results-ymin)*(xmax-xmin)/(ymax-ymin) + xmin;

You can manually check and confirm the values as follows:

p2 = mapminmax('apply', net(:, 1), net.inputs{1}.processSettings{1})

-> preprocessed data

y1 = purelin ( net.LW{2} * tansig(net.iw{1}* p2 + net.b{1}) + net.b{2})

-> Neural Network processed data

y2 = mapminmax( 'reverse' , y1, net.outputs{2}.processSettings{1})

-> postprocessed data

Reference: http://www.mathworks.com/matlabcentral/answers/14517-processing-of-i-p-data

Upvotes: 3

Related Questions