mlghost

Reputation: 33

Transfer Learning for Regression in Matlab

I am trying to implement a model that takes an image as input and outputs a vector of 26 numbers. I am currently using VGG-16 through the following Matlab code:

net = vgg16; % load the pretrained network
analyzeNetwork(net);
NUM_OUTPUT = 26;
layers = net.Layers;
%output = fullyConnectedLayer(NUM_OUTPUT, ...
%                             'Name','output_layer', ...
%                             'WeightLearnRateFactor',10, ...
%                             'BiasLearnRateFactor',10);
layers = [
    layers(1:38)
    fullyConnectedLayer(NUM_OUTPUT)
    regressionLayer];

%layers(1:67) = freezeWeights(layers(1:67));
miniBatchSize  = 5;
validationFrequency = floor(numel(YTrain)/miniBatchSize);
options = trainingOptions('sgdm',...
    'MiniBatchSize',miniBatchSize, ...
    'InitialLearnRate',0.001, ...
    'ValidationData',{XValidation,YValidation},...
    'ValidationFrequency',validationFrequency, ...
    'Plots','training-progress',...
    'Verbose',false);

net = trainNetwork(XTrain,YTrain,layers,options);
YPred = predict(net,XValidation);
predictionError = YValidation - YPred;
thr = 10;
numCorrect = sum(abs(predictionError) < thr);
numImagesValidation = numel(YValidation);

accuracy = numCorrect/numImagesValidation;
rmse = sqrt(mean(predictionError.^2));

The shape of XTrain and YTrain are as follows:
XTrain: 224 224 3 140
YTrain: 26 140

Running the code above (it is only part of the full script) gives the following error:

Error using trainNetwork (line 170) Number of observations in X and Y disagree.

I would appreciate it if somebody could help me figure out the problem, because as far as I know the number of samples is the same in both, and there is no requirement for the remaining dimensions to match.

Upvotes: 0

Views: 305

Answers (1)

Mendi Barel

Reputation: 3687

Transpose YTrain to be 140x26.
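trainNetwork expects regression responses as a numObservations-by-numResponses matrix (observations in rows), so a minimal fix using the question's own variables is:

```matlab
% trainNetwork expects responses as numObservations-by-numResponses,
% so flip the 26x140 matrices to 140x26:
YTrain = YTrain.';           % now 140x26
YValidation = YValidation.'; % now 140x26
```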

Name your new layers, and assemble them into a layerGraph.

Regression can easily go unstable, so decrease the learning rate or increase the batch size if you get NaNs.
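As a sketch of that advice (the values here are illustrative, not tuned), gradient clipping via trainingOptions is another way to tame NaNs:

```matlab
% Illustrative stabilization settings: smaller learning rate,
% larger mini-batch, and L2-norm gradient clipping.
options = trainingOptions('sgdm', ...
    'InitialLearnRate',1e-5, ...        % lower than the 1e-4 used below
    'MiniBatchSize',32, ...             % larger batches smooth the gradient
    'GradientThreshold',1, ...          % clip gradients whose L2 norm exceeds 1
    'GradientThresholdMethod','l2norm');
```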

net = vgg16; % analyzeNetwork(net);
LAYERS_FREEZE_UNTIL=35;
LAYERS_COPY_UNTIL=38;


NUM_TRAIN_SAMPLES = size(YTrain,1);
NUM_OUTPUT = size(YTrain,2);


my_layers = layerGraph([
    freezeWeights(net.Layers(1:LAYERS_FREEZE_UNTIL))
    net.Layers(LAYERS_FREEZE_UNTIL+1:LAYERS_COPY_UNTIL)
    fullyConnectedLayer(NUM_OUTPUT*2,'Name','my_fc1')
    fullyConnectedLayer(NUM_OUTPUT,'Name','my_fc2')
    regressionLayer('Name','my_regr')
    ]);
% figure; plot(my_layers), ylim([0.5,6.5])
% analyzeNetwork(my_layers);

MINI_BATCH_SIZE = 16;

options = trainingOptions('sgdm', ...
    'MiniBatchSize',MINI_BATCH_SIZE, ...
    'MaxEpochs',20, ...
    'InitialLearnRate',1e-4, ...
    'Shuffle','every-epoch', ...
    'ValidationData',{XValidation,YValidation}, ...
    'ValidationFrequency',floor(NUM_TRAIN_SAMPLES/MINI_BATCH_SIZE), ...
    'Verbose',true, ...
    'Plots','training-progress');

my_net = trainNetwork(XTrain,YTrain,my_layers,options);

Upvotes: 0
