Christina

Reputation: 935

Random selection of images from a folder

I have a folder that contains 400 images. I want to split it into two folders: train_images and test_images.

train_images should contain 150 randomly selected images, all different from each other. test_images should also contain 150 randomly selected images, different from each other and from the images selected for train_images.

I began by writing code that selects a random set of images from the Faces folder and puts them in the train_images folder. I need your help to achieve the behavior described above.

clear all;
close all;
clc;

Train_images = 'train_faces';
mkdir(Train_images);

ImageFiles = dir('Faces');
ImageFiles = ImageFiles(~[ImageFiles.isdir]); % drop '.', '..' and any subfolders
totalNumberOfImages = length(ImageFiles);
scrambledList = randperm(totalNumberOfImages); % each index appears exactly once
numberIWantToUse = 150;
for index = scrambledList(1:numberIWantToUse)
    baseFileName = ImageFiles(index).name;
    str = fullfile('Faces', baseFileName); % Better than STRCAT

    face = imread(str);

    imwrite(face, fullfile(Train_images, ['hello' num2str(index) '.jpg']));
end

Any help would be much appreciated.

Upvotes: 0

Views: 952

Answers (2)

lennon310

Reputation: 12689

Your code looks good to me. For the test set, do not re-run scrambledList = randperm(totalNumberOfImages); and take the first 150 elements again: the new permutation is independent of the first one, so some test images could repeat training images.

Instead, reuse the same scrambledList and loop over its next 150 entries:

for index = scrambledList(numberIWantToUse+1 : 2*numberIWantToUse)
   ... % same body as your training loop, but write to the test folder

end

With this approach, your test sample will be completely different from the training sample, because a permutation contains each index exactly once.
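To make the idea concrete, here is a minimal sketch of the split. It assumes 400 images, as stated in the question, and a destination folder named test_faces, which is a hypothetical name chosen to mirror train_faces. It also uses copyfile rather than the imread/imwrite pair from the question, which is a substitution on my part that leaves the files byte-for-byte unchanged.

```matlab
% Sketch only: the image count and the 'test_faces' name are assumptions.
totalNumberOfImages = 400;   % count stated in the question
numberIWantToUse    = 150;   % images per set

% A single permutation holds each index exactly once, so two
% non-overlapping slices of it can never share an image.
scrambledList = randperm(totalNumberOfImages);
trainIdx = scrambledList(1:numberIWantToUse);
testIdx  = scrambledList(numberIWantToUse+1 : 2*numberIWantToUse);

% Sanity checks: no repeats inside a set, no overlap between the sets.
assert(numel(unique(trainIdx)) == numberIWantToUse);
assert(numel(unique(testIdx))  == numberIWantToUse);
assert(isempty(intersect(trainIdx, testIdx)));

% Copy the test images only if the source folder is really there
% and holds the assumed number of files.
if exist('Faces', 'dir') == 7
    files = dir('Faces');
    files = files(~[files.isdir]);   % drop '.', '..' and any subfolders
    if numel(files) == totalNumberOfImages
        if exist('test_faces', 'dir') ~= 7
            mkdir('test_faces');
        end
        for index = testIdx
            copyfile(fullfile('Faces', files(index).name), ...
                     fullfile('test_faces', files(index).name));
        end
    end
end
```

The same loop with trainIdx and train_faces would fill the training folder. The remaining 100 indices, scrambledList(301:400), are simply left unused.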

Upvotes: 1

phyrox

Reputation: 2449

Assuming you have the Bioinformatics Toolbox, you can use crossvalind with the HoldOut method.

Here is an example. train and test are logical arrays, so you can use find to get the actual indexes:

ImageFiles = dir('Faces');
ImageFiles = ImageFiles(~[ImageFiles.isdir]); % drop '.', '..' and any subfolders
ImageFilesIndexes = ones(1, length(ImageFiles)); % use a numeric array instead of the char array
proportion = 150/400; % fraction held out for testing
[train, test] = crossvalind('HoldOut', ImageFilesIndexes, proportion);
training_files = ImageFiles(train); % 250 files: it is better to use more data to train
testing_files = ImageFiles(test); % 150 files

%Then do whatever you like with the files

Other possibilities are dividerand (Neural Network Toolbox) and cvpartition (Statistics Toolbox).
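As a rough illustration of the cvpartition route, the sketch below holds out exactly 150 indices for testing and then, since the question asks for only 150 training images, subsamples 150 of the remaining 250. It assumes the Statistics Toolbox is installed and that there are 400 images. The variable names are my own.

```matlab
% Sketch only: n = 400 is the image count stated in the question.
n = 400;

% An integer third argument is read as a number of test observations,
% not as a fraction, so exactly 150 indices are held out.
c = cvpartition(n, 'HoldOut', 150);
testIdx      = find(test(c));       % 150 test indices
remainingIdx = find(training(c));   % the other 250 indices

% Keep only 150 of the remaining 250 for training.
pick     = randperm(numel(remainingIdx), 150);
trainIdx = remainingIdx(pick);
```

trainIdx and testIdx can then index into the file list returned by dir, once the '.' and '..' entries have been removed.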

Upvotes: 1
