
Python model.fit_generator gets stuck on first epoch and tries to compute "unknown" number of steps

I am very new to Python and image classification models, but I have some TensorFlow code that worked fine until recently. Now, when I reach this part of the code, I suddenly run into a problem. I am running it in a Google Colab notebook.

epochs = 5

history = model.fit_generator(train_generator, 

fit_generator cannot determine the number of steps per epoch and reports it as Unknown. The first epoch then runs indefinitely, with the accuracy slowly creeping toward 1 if I leave it long enough.

Epoch 1/5
    325/Unknown - 992s 3s/step - loss: 0.2221 - accuracy: 0.9318

Does anyone have ideas what would cause it to have an Unknown number of steps per epoch and never get past epoch 1?

Here is some more information from the code that may be relevant (the training set has 1602 images and the test set has 395, across 11 classes):

Found 1602 images belonging to 11 classes.
Found 395 images belonging to 11 classes.

The batch size was set to 64.

# Pull a single batch from the generator to inspect its shape
for image_batch, label_batch in train_generator:
    break
image_batch.shape, label_batch.shape
((64, 224, 224, 3), (64, 11))

# Create the base model from the pre-trained model MobileNet V2
base_model = tf.keras.applications.MobileNetV2(input_shape=IMG_SHAPE,
base_model.trainable = False
model = tf.keras.Sequential([
  tf.keras.layers.Conv2D(32, 3, activation='relu'),
  tf.keras.layers.Dense(11, activation='softmax')

model summary

Model: "sequential_2"
Layer (type)                 Output Shape              Param #   
mobilenetv2_1.00_224 (Model) (None, 7, 7, 1280)        2257984   
conv2d_2 (Conv2D)            (None, 5, 5, 32)          368672    
dropout_2 (Dropout)          (None, 5, 5, 32)          0         
global_average_pooling2d_2 ( (None, 32)                0         
dense_2 (Dense)              (None, 11)                363       
Total params: 2,627,019
Trainable params: 369,035
Non-trainable params: 2,257,984

Upvotes: 0

Views: 1648

Answers (2)

Anmol Deep

Reputation: 683

It could be an issue related to multiprocessing. You can also try setting workers=1 and use_multiprocessing=False. It worked for me.
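
A rough sketch of how those arguments could be passed, assuming the model and train_generator from the question are in scope. The name fit_kwargs is only illustrative; workers and use_multiprocessing are keyword arguments that fit_generator accepts, and epochs=5 comes from the question.

```python
# Keyword arguments that keep batch loading on a single worker
# and avoid process-based parallelism.
fit_kwargs = {
    "epochs": 5,                   # value used in the question
    "workers": 1,                  # a single worker feeds batches
    "use_multiprocessing": False,  # threads rather than separate processes
}

# With the question's objects defined, the call would look like:
# history = model.fit_generator(train_generator, **fit_kwargs)
print(fit_kwargs)
```

If training still hangs with these settings, multiprocessing is probably not the cause.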

Upvotes: 0

Bashir Kazimi

Reputation: 1377

You should pass the steps_per_epoch and validation_steps parameters to fit_generator so the model knows how many batches there are in the training and validation sets.

The values for those parameters are usually the number of examples divided by the batch size. In this case:

steps_per_epoch = 1602//64
validation_steps = 395//64

                validation_data=val_generator, steps_per_epoch=1602//64, validation_steps=395//64)
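
One thing to watch: floor division drops the final partial batch. 1602 // 64 is 25, which covers 1600 images and skips 2, and 395 // 64 is 6, which skips 11. If you want every image to be seen once per epoch, you can round up instead. Here is a minimal sketch; steps_for is a hypothetical helper name, not part of Keras, and whether the last, shorter batch is actually yielded depends on your generator.

```python
import math


def steps_for(num_samples, batch_size, drop_remainder=False):
    """Return how many batches one pass over num_samples takes."""
    if drop_remainder:
        # Floor division: ignore the leftover partial batch
        return num_samples // batch_size
    # Ceiling division: include the leftover partial batch
    return math.ceil(num_samples / batch_size)


# Sizes taken from the question: 1602 training and 395 validation images
train_steps = steps_for(1602, 64)
val_steps = steps_for(395, 64)
print(train_steps, val_steps)  # 26 7
```

Either choice gives fit_generator a finite step count, so the progress bar should show a total instead of Unknown and the epoch can end.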

Upvotes: 1
