Reputation: 121
I am creating a plant disease identification model. I have a dataset of 38 diseases with around 2000 images for each disease. But while training the model, some epochs are getting skipped with an OUT_OF_RANGE error. Can someone please help me figure this out?
import os
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Input
train_dir = 'dataset/train'
valid_dir = 'dataset/valid'
batch_size = 32
train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest'
)
valid_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_directory(
    train_dir,
    target_size=(150, 150),
    batch_size=batch_size,
    class_mode='categorical'
)
valid_generator = valid_datagen.flow_from_directory(
    valid_dir,
    target_size=(150, 150),
    batch_size=batch_size,
    class_mode='categorical'
)
model = Sequential([
    Input(shape=(150, 150, 3)),
    Conv2D(32, (3, 3), activation='relu'),
    MaxPooling2D(2, 2),
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D(2, 2),
    Conv2D(128, (3, 3), activation='relu'),
    MaxPooling2D(2, 2),
    Flatten(),
    Dense(512, activation='relu'),
    Dense(38, activation='softmax')  # Adjust output units based on the number of disease classes
])
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
history = model.fit(
    train_generator,
    steps_per_epoch=train_generator.samples // batch_size,
    epochs=10,
    validation_data=valid_generator,
    validation_steps=valid_generator.samples // batch_size
)
model.save('plant_disease_model.h5')
class_indices = train_generator.class_indices
disease_names = list(class_indices.keys())
print("Mapping of Class Indices to Disease Names:", class_indices)
Terminal:
Found 70295 images belonging to 38 classes.
Found 17572 images belonging to 38 classes.
2024-04-23 19:50:32.085744: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
Epoch 1/10
\.venv\Lib\site-packages\keras\src\trainers\data_adapters\py_dataset_adapter.py:120: UserWarning: Your `PyDataset` class should call `super().__init__(**kwargs)` in its constructor. `**kwargs` can include `workers`, `use_multiprocessing`, `max_queue_size`. Do not pass these arguments to `fit()`, as they will be ignored.
  self._warn_if_super_not_called()
2196/2196 ━━━━━━━━━━━━━━━━━━━━ 905s 411ms/step - accuracy: 0.4608 - loss: 1.8737 - val_accuracy: 0.7432 - val_loss: 0.8556
Epoch 2/10
   1/2196 ━━━━━━━━━━━━━━━━━━━━ 12:02 329ms/step - accuracy: 0.6875 - loss: 0.7820
2024-04-23 20:05:37.996528: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
         [[{{node IteratorGetNext}}]]
C:\Users\Admin\AppData\Local\Programs\Python\Python311\Lib\contextlib.py:155: UserWarning: Your input ran out of data; interrupting training. Make sure that your dataset or generator can generate at least `steps_per_epoch * epochs` batches. You may need to use the `.repeat()` function when building your dataset.
  self.gen.throw(typ, value, traceback)
2024-04-23 20:05:38.068817: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
         [[{{node IteratorGetNext}}]]
2196/2196 ━━━━━━━━━━━━━━━━━━━━ 0s 49us/step - accuracy: 0.6875 - loss: 0.7820 - val_accuracy: 0.7500 - val_loss: 0.2462
As you can see above, epoch 1 completed successfully, but epoch 2 was terminated by the error. Likewise, epochs 3, 5, 7, and 9 completed successfully, while epochs 4, 6, 8, and 10 failed with the same error.
Upvotes: 5
Views: 7753
Reputation: 53
This usually happens when the dataset is read without .repeat()
and the iterator tries to read beyond the end of the dataset.
That in turn happens when the number of steps per epoch is set incorrectly.
Make sure to pay attention to your test and validation dataset iterators too!
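To see why the failures alternate between epochs, here is a simplified, TensorFlow-free model of the behavior (an assumption on my part: Keras keeps drawing from the same finite sequence of batches and only restarts it after an end-of-sequence error, which matches the pattern in your log):

```python
# Simplified, TF-free sketch of the failure: the data yields 2197
# batches in total (math.ceil(70295 / 32)), but each epoch requests
# only 2196 steps (70295 // 32), so one batch is left over per pass.
TOTAL_BATCHES = 2197
STEPS_PER_EPOCH = 2196  # the miscounted setting

def run_epochs(n_epochs):
    results = []
    batches = iter(range(TOTAL_BATCHES))
    for epoch in range(1, n_epochs + 1):
        try:
            for _ in range(STEPS_PER_EPOCH):
                next(batches)
            results.append((epoch, "completed"))
        except StopIteration:  # surfaced by TensorFlow as OUT_OF_RANGE
            results.append((epoch, "OUT_OF_RANGE"))
            batches = iter(range(TOTAL_BATCHES))  # start a fresh pass
    return results

print(run_epochs(4))
# [(1, 'completed'), (2, 'OUT_OF_RANGE'), (3, 'completed'), (4, 'OUT_OF_RANGE')]
```

The leftover batch accumulates, so every second epoch hits the end of the sequence almost immediately, exactly the odd-succeeds/even-fails pattern you observed.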
Calculate it manually.
Example calculation with your dataset:
70295 training images / 32 batch size = 2196.71...
17572 validation images / 32 batch size = 549.125
As you don't have an integer as a result, you can use math.ceil
for an accurate calculation:
This function rounds up to the nearest integer, ensuring all batches are included in training and validation.
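As a quick sanity check, here are the step counts computed from the sample counts printed in your own log:

```python
import math

train_samples = 70295  # "Found 70295 images belonging to 38 classes."
valid_samples = 17572  # "Found 17572 images belonging to 38 classes."
batch_size = 32

# Floor division silently drops the final partial batch:
print(train_samples // batch_size)            # 2196 -> 23 images never requested
# Ceiling division covers every image:
print(math.ceil(train_samples / batch_size))  # 2197
print(math.ceil(valid_samples / batch_size))  # 550
```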
Your current code:
history = model.fit(
    train_generator,
    steps_per_epoch=train_generator.samples // batch_size,
    epochs=10,
    validation_data=valid_generator,
    validation_steps=valid_generator.samples // batch_size
)
New version:
import math

history = model.fit(
    train_generator,
    steps_per_epoch=math.ceil(train_generator.samples / batch_size),
    epochs=10,
    validation_data=valid_generator,
    validation_steps=math.ceil(valid_generator.samples / batch_size)
)
Upvotes: 4