Reputation: 2160
It seems like I could get the exact same result by making num_samples bigger and keeping nb_epoch=1. I thought the purpose of multiple epochs was to iterate over the same data multiple times, but Keras doesn't reinstantiate the generator at the end of each epoch — it just keeps consuming it. For example, training this autoencoder:
import numpy as np
from keras.layers import (Convolution2D, MaxPooling2D,
                          UpSampling2D, Activation)
from keras.models import Sequential

rand_imgs = [np.random.rand(1, 100, 100, 3) for _ in range(1000)]

def keras_generator():
    # Endless generator; its index persists across epoch boundaries.
    i = 0
    while True:
        print(i)
        rand_img = rand_imgs[i]
        i = (i + 1) % len(rand_imgs)  # wrap so longer runs don't IndexError
        yield (rand_img, rand_img)

layers = [
    Convolution2D(20, 5, 5, border_mode='same',
                  input_shape=(100, 100, 3), activation='relu'),
    MaxPooling2D((2, 2), border_mode='same'),
    Convolution2D(3, 5, 5, border_mode='same', activation='relu'),
    UpSampling2D((2, 2)),
    Convolution2D(3, 5, 5, border_mode='same', activation='relu')]

autoencoder = Sequential()
for layer in layers:
    autoencoder.add(layer)

gen = keras_generator()
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')
history = autoencoder.fit_generator(gen, samples_per_epoch=100, nb_epoch=2)
It seems like I get the same result with (samples_per_epoch=100, nb_epoch=2) as I do for (samples_per_epoch=200, nb_epoch=1). Am I using fit_generator as intended?
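The claimed equivalence can be checked without Keras at all. This minimal sketch (plain Python; the `simulate_fit_generator` helper is hypothetical, not part of Keras) models how fit_generator pulls samples_per_epoch items per epoch from a single persistent generator:

```python
def keras_generator(data):
    # Simplified version of the question's generator: an endless
    # stream whose index persists across epoch boundaries.
    i = 0
    while True:
        yield data[i % len(data)]
        i += 1

def simulate_fit_generator(gen, samples_per_epoch, nb_epoch):
    # Model fit_generator's consumption: pull samples_per_epoch
    # items per epoch from the same generator, never restarting it.
    return [[next(gen) for _ in range(samples_per_epoch)]
            for _ in range(nb_epoch)]

data = list(range(1000))
a = simulate_fit_generator(keras_generator(data), samples_per_epoch=100, nb_epoch=2)
b = simulate_fit_generator(keras_generator(data), samples_per_epoch=200, nb_epoch=1)

# Both settings draw the same 200 samples in the same order;
# only where the "epoch" boundary falls differs.
assert sum(a, []) == sum(b, [])
```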
Upvotes: 4
Views: 2542
Reputation: 40506
Yes - you are right that when using keras.fit_generator these two approaches are equivalent. But there are a variety of reasons why keeping epochs is reasonable:

- An epoch comprises the amount of data after which you want to log some important statistics about training (e.g. time taken, or loss at the end of the epoch). It is therefore reasonable to set batch_size and nb_epoch to values such that an epoch comprises going through every example in your dataset once.
- When using a flow generator - e.g. when you have a set of pictures loaded into Python and you want to use Keras's ImageDataGenerator to apply different kinds of data transformations - setting batch_size and nb_epoch so that an epoch comprises going through every example in your dataset might help you keep track of the progress of your training process.

Upvotes: 4
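If you do want each epoch to correspond to exactly one pass over the data (so per-epoch loss logs are comparable), the generator itself can enforce that. A minimal sketch in plain Python, using a hypothetical `epoch_aware_generator` helper not taken from Keras; you would pass it to fit_generator with samples_per_epoch equal to the dataset size:

```python
import random

def epoch_aware_generator(data, seed=0):
    # Hypothetical helper (not part of Keras): an endless generator
    # that reshuffles the index order each time a full pass completes,
    # so every "epoch" of len(data) samples sees each example exactly once.
    rng = random.Random(seed)
    while True:
        order = list(range(len(data)))
        rng.shuffle(order)
        for i in order:
            yield data[i]

gen = epoch_aware_generator(list(range(10)))
first_pass = [next(gen) for _ in range(10)]
second_pass = [next(gen) for _ in range(10)]

# Each pass covers the whole dataset exactly once, typically in a
# different order.
assert sorted(first_pass) == list(range(10))
assert sorted(second_pass) == list(range(10))
```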