Reputation: 69
I purchased Colab Pro to train my CNN model. When I train the model with only 4k images, training starts instantly, but when I train with 30k images, training never starts: it got stuck at the first epoch even after I waited an hour. There is nothing wrong with my code; I double-checked it. The image shows where it got stuck at the first epoch when training with 30k images.
Upvotes: 1
Views: 1289
Reputation: 1117
When you want to fit a model on a large number of images, you can't pass the entire dataset at once; you have to use a generator that feeds the model only the current batch. Consider writing one, or using TFRecords. Here is a good example on Google Codelabs:
https://codelabs.developers.google.com/codelabs/keras-flowers-tpu/#4
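As a minimal sketch of the idea, here is a hand-written batch generator in plain NumPy (the names `batch_generator` and `load_fn` are hypothetical, not part of Keras): it loads only one batch of images into memory per step instead of the whole dataset.

```python
import numpy as np

def batch_generator(image_paths, labels, batch_size, load_fn):
    """Yield (images, labels) one batch at a time instead of loading all images.

    image_paths : list of file paths (or any keys load_fn understands)
    labels      : NumPy array of labels, aligned with image_paths
    load_fn     : function that loads a single image into a NumPy array
    """
    n = len(image_paths)
    labels = np.asarray(labels)
    while True:  # Keras-style generators loop forever; steps_per_epoch bounds each epoch
        idx = np.random.permutation(n)  # reshuffle every epoch
        for start in range(0, n, batch_size):
            batch_idx = idx[start:start + batch_size]
            # only batch_size images are in memory at any time
            images = np.stack([load_fn(image_paths[i]) for i in batch_idx])
            yield images, labels[batch_idx]
```

For example, `next(batch_generator(paths, labels, 32, load_fn))` returns one batch of 32 images; `model.fit` can consume such a generator directly with `steps_per_epoch = len(paths) // 32`.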
Upvotes: 0
Reputation: 69
I fixed the large-dataset issue using a generator; the code below is what I used.
from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(
    rescale=1./255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    validation_split=0.2)  # set validation split

train_generator = train_datagen.flow_from_directory(
    data_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='categorical',
    subset='training')  # set as training data

validation_generator = train_datagen.flow_from_directory(
    data_dir,  # same directory as training data
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='categorical',
    subset='validation')  # set as validation data

history = model.fit_generator(  # deprecated in TF >= 2.1; model.fit accepts generators too
    train_generator,
    steps_per_epoch=train_generator.samples // batch_size,
    validation_data=validation_generator,
    validation_steps=validation_generator.samples // batch_size,
    epochs=epochs)
Upvotes: 1
Reputation: 77
Batch-wise processing is a good approach for training on large datasets; the delay comes from loading your entire dataset at once. Below is an example (note how steps_per_epoch divides the data into batches). Choose the batch size carefully according to your data.
batch_size = 1000
history = model.fit_generator(
    train_generator,
    steps_per_epoch=train_generator.samples // train_generator.batch_size,
    epochs=5,
    validation_data=validation_generator,
    validation_steps=validation_generator.samples // train_generator.batch_size,
    verbose=1)
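The steps-per-epoch arithmetic is just integer division of the sample count by the batch size. Using the 30k images from the question (a worked example, not the asker's actual values):

```python
samples = 30000      # images in the training set (from the question)
batch_size = 1000    # as in the snippet above
steps_per_epoch = samples // batch_size
print(steps_per_epoch)  # 30 batches per epoch
```

Any remainder is dropped by the floor division, so up to batch_size - 1 samples per epoch are skipped when the dataset size is not a multiple of the batch size.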
Upvotes: 0