Reputation: 11
I am trying to fit a CNN (AlexNet architecture) on a dataset of 4900 images (480*640*3), and I would like to use data augmentation. I wrote a custom generator that uses ImageDataGenerator, because the images and their labels are stored under different paths. The class collects all the paths into two lists, one for the image paths and one for the labels, then loads them in batches of 32 and passes each batch through the image data generator:
This is the method of the custom generator that the model calls during fitting, and it is where I apply the ImageDataGenerator:
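# Imports this method relies on (assumed; not shown in the original post)
import numpy as np
import matplotlib.pyplot as plt
from tensorflow.keras.preprocessing.image import ImageDataGenerator

def __getitem__(self, index):
    # Slice out the file paths and labels that belong to this batch
    batch_x = self.img_filenames[index * self.batch_size : (index + 1) * self.batch_size]
    batch_y = self.labels[index * self.batch_size : (index + 1) * self.batch_size]
    gen = ImageDataGenerator(rescale=1./255,
                             rotation_range=90,
                             brightness_range=(0.1, 0.9),
                             horizontal_flip=True)
    # Load the images from disk, then draw one randomly augmented batch
    X = [plt.imread(filename) for filename in batch_x]
    X, Y = next(gen.flow(x=np.array(X), y=np.array(batch_y), batch_size=self.batch_size))
    return X, Y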
I have some questions:
What is ImageDataGenerator supposed to return? If I pass 32 (batch_size) different images, does it return 32 modified images, one for each input, or 32 images for each input? And if I pass only 1 image with a batch size of 32, does it return 32 modified images derived from that one? I'm almost sure it is one per input, but I want to confirm.
Secondly, suppose I want 40k images. If I reset the index to 0 when it exceeds samples//batch_size, and change the __len__ method to multiply the length by 2 or whatever factor I want, then since the images are augmented randomly, I should get another 4900 new images on each pass, or as many as I want, shouldn't I?
The main problem is that once the accuracy reaches 0.5 it stops increasing. I have tried 3 epochs and the result is the same: it increases for the first 3 or 4 batches and then stops, which is why I have these doubts.
Thank you.
Upvotes: 1
Views: 2395
Reputation: 11
I printed X.shape and it seems to be the same 32 images, but modified, so it doesn't multiply the images. The method I described for augmenting the data works fine too.
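To illustrate why the batch size stays at 32, here is a minimal NumPy sketch of a per-image random flip. It is only a stand-in for the idea, not Keras' actual implementation; the function name random_flip_batch and the small 48x64 image size are assumptions made for the example.

```python
import numpy as np

def random_flip_batch(batch, rng):
    """Flip each image left-right with probability 0.5 (one output per input)."""
    out = batch.copy()
    for i in range(len(batch)):
        if rng.random() < 0.5:
            # Reverse the width axis of image i; the batch size never changes
            out[i] = batch[i][:, ::-1, :]
    return out

rng = np.random.default_rng(0)
# Small stand-in for 32 RGB images (the real ones are 480x640x3)
batch = rng.random((32, 48, 64, 3))
augmented = random_flip_batch(batch, rng)
print(augmented.shape)  # (32, 48, 64, 3)
```

Each output image is either the original or its mirrored version, so 32 inputs give 32 outputs.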
Upvotes: 0
Reputation: 2959
Let me try to answer
1. If you pass a batch of 32 images to ImageDataGenerator with only horizontal_flip=True, it returns 32 images, one per input, and each one is flipped horizontally at random (about half of them on average). The originals are not passed along in addition to the flipped copies.
If you set both horizontal_flip and vertical_flip, you still get 32 images; each flip is applied at random and independently to every image.
For brightness_range, a single brightness factor is drawn at random from the range for each image. It means that if your brightness range is 0.1-0.5, you still get 32 images, not 32*5.
I am not sure about the second question. A better choice is to do more data augmentation on both the training and test data.
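For what it is worth, the index-wrapping idea from the question could be sketched roughly as below. This is a plain-Python stand-in, not a Keras class: the name WrappingBatches, the repeats parameter, and the dummy file names are all hypothetical. It only shows the __len__ and __getitem__ arithmetic; the random augmentation would still have to be applied to each returned batch.

```python
import math

class WrappingBatches:
    """Hypothetical stand-in for the custom generator (not a Keras class)."""

    def __init__(self, filenames, labels, batch_size=32, repeats=2):
        self.filenames = filenames
        self.labels = labels
        self.batch_size = batch_size
        self.repeats = repeats
        # Number of batches needed to see every sample once
        self.batches_per_pass = math.ceil(len(filenames) / batch_size)

    def __len__(self):
        # Report more batches per epoch than one pass over the data
        return self.batches_per_pass * self.repeats

    def __getitem__(self, index):
        if not 0 <= index < len(self):
            raise IndexError(index)
        # Wrap around so later indices revisit the same files
        index = index % self.batches_per_pass
        start = index * self.batch_size
        stop = start + self.batch_size
        return self.filenames[start:stop], self.labels[start:stop]

# Dummy data: 4900 made-up file names with alternating labels
files = [f"img_{i}.jpg" for i in range(4900)]
labels = [i % 2 for i in range(4900)]
gen = WrappingBatches(files, labels, batch_size=32, repeats=8)
print(len(gen))  # 1232 batches, i.e. 8 passes of 154 batches
```

With random augmentation applied inside __getitem__, each revisit of a batch would see differently transformed versions of the same 4900 source images, so the extra samples are variations rather than independent new data.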
For the third question, you should try EfficientNet with focal loss.
Upvotes: 2