Reputation: 1227
I'm new to Keras, so I get confused between the Keras documentation and other people's examples of using fit_generator. I test-ran the code below with 100 samples (for the sake of speedy output; the actual training set is more than 10k) with a batch size of 32 for 2 epochs:
# Create a generator that generates an image and a label one at a time
# (because loading all data into memory will freeze the laptop)
def generate_transform(imgs, lbls):
    while 1:
        for i in range(len(imgs)):
            img = np.array(cv2.resize(imgs[i], (224, 224)))
            lbl = to_categorical(lbls[i], num_classes=10)
            yield (img, lbl)
history = model.fit_generator(generate_transform(x[:100], y[:100]),
                              steps_per_epoch=100/32,
                              samples_per_epoch=100,
                              nb_epoch=2,
                              validation_data=generate_transform(x_test[:100], y_test[:100]),
                              validation_steps=100)
                              # nb_val_samples=100?)
I got these UserWarnings:
D:\Users\jason\AppData\Local\Continuum\Anaconda3\lib\site-packages\ipykernel_launcher.py:8: UserWarning: The semantics of the Keras 2 argument `steps_per_epoch` is not the same as the Keras 1 argument `samples_per_epoch`. `steps_per_epoch` is the number of batches to draw from the generator at each epoch. Basically steps_per_epoch = samples_per_epoch/batch_size. Similarly `nb_val_samples`->`validation_steps` and `val_samples`->`steps` arguments have changed. Update your method calls accordingly.
D:\Users\jason\AppData\Local\Continuum\Anaconda3\lib\site-packages\ipykernel_launcher.py:8: UserWarning: Update your `fit_generator` call to the Keras 2 API: `fit_generator(<generator..., steps_per_epoch=100, validation_data=<generator..., validation_steps=100, epochs=2)`
And the output looked like this:
Epoch 1/2
100/100 [==============================] - 84s 836ms/step - loss: 3.0745 - acc: 0.4500 - val_loss: 2.3886 - val_acc: 0.0300
Epoch 2/2
100/100 [==============================] - 86s 864ms/step - loss: 0.3654 - acc: 0.9000 - val_loss: 2.4644 - val_acc: 0.0900
My questions are:
Was my call correct with those arguments and their supplied values?
Was my model trained on 32 images and labels at each step, and with 100/32 steps per epoch?
Am I required to use the argument steps_per_epoch?
Which argument should I use: validation_steps or nb_val_samples?
Did my model validate all 100 samples of the validation generator (as indicated by x_test[:100]) 100 times (as indicated by validation_steps=100), or did it only validate 100 times with one sample each (because the validation generator only yields one sample at a time)? Why didn't the output show the number of validation steps?
Did my model use the trained weights from the first epoch to train on the same data again? Is that why the training accuracy jumped from 0.45 in the first epoch to 0.9 in the second?
Could you please help me with the above questions?
Thanks in advance.
Upvotes: 0
Views: 826
Reputation: 1313
I ran into this problem and solved it in my code below (before: Keras 1.1.2 ==> after: Keras 2.2.4):
# Old Keras==1.1.2 fit_generator
# history = model.fit_generator(
#     train_data_generator.get_data(),
#     samples_per_epoch=train_data_generator.get_num_files(),
#     nb_epoch=config["num_epochs"],
#     verbose=1,
#     validation_data=validation_data_generator.get_data(should_shuffle=False),
#     nb_val_samples=validation_data_generator.get_num_files(),
#     nb_worker=2,
#     max_q_size=batch_size,
#     pickle_safe=True)

# New working! Keras 2.2.4 fit_generator
history = model.fit_generator(
    train_data_generator.get_data(),
    verbose=1,
    validation_data=validation_data_generator.get_data(should_shuffle=False),
    steps_per_epoch=train_data_generator.get_num_files() // batch_size,
    epochs=config["num_epochs"],
    validation_steps=validation_data_generator.get_num_files() // batch_size,
    workers=2, use_multiprocessing=True,
    max_queue_size=batch_size)
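For reference, the renames behind that change (all visible in the warning text in your question and in the before/after call above) line up like this:

samples_per_epoch -> steps_per_epoch       (now counts batches, not samples)
nb_epoch          -> epochs
nb_val_samples    -> validation_steps
nb_worker         -> workers
max_q_size        -> max_queue_size
pickle_safe       -> use_multiprocessing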
Looking at your code, you need just steps_per_epoch, not samples_per_epoch, and you should change nb_epoch to epochs. I don't fully understand your code or the training/validation setup (100 train and 100 validation samples?), and it's best to ask one question per post, but I'll take a stab at fixing your call (untested, of course).
Keep in mind that number_of_steps == number_of_samples // batch_size, and if 100 is num_training_samples, you'll need a pretty small batch_size for number_of_steps to make sense (with batch_size=32, 100 // 32 == 3 steps, so 4 samples per epoch are never seen):
history = model.fit_generator(
    generate_transform(x[:100], y[:100]),  # training data generator
    verbose=1,
    validation_data=generate_transform(x_test[:100], y_test[:100]),  # validation data generator
    steps_per_epoch=100 // batch_size,  # 100 is num_training_samples; divided by batch_size == steps_per_epoch
    epochs=2,
    validation_steps=100 // batch_size  # 100 is num_val_samples; divided by batch_size == validation_steps
)
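One more thing to watch: your generate_transform yields a single (image, label) pair at a time, so each "step" above would consume only one sample rather than a batch of 32. Here is a minimal batched variant, just a sketch reusing the np, cv2, and to_categorical imports from your question (the generate_batches name and its batch_size parameter are mine, not anything from your code):

def generate_batches(imgs, lbls, batch_size=32):
    # Loop forever, since fit_generator expects an endless generator
    while True:
        for start in range(0, len(imgs), batch_size):
            # Resize each image in the slice and stack them into one batch array
            batch_imgs = np.array([cv2.resize(img, (224, 224))
                                   for img in imgs[start:start + batch_size]])
            # One-hot encode the matching slice of labels
            batch_lbls = to_categorical(lbls[start:start + batch_size], num_classes=10)
            yield (batch_imgs, batch_lbls)

With a generator like this, steps_per_epoch=100 // batch_size really does walk through (roughly) all 100 training samples once per epoch.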
Upvotes: 1