Reputation: 121
I am creating a plant disease identification model. I have a dataset of 38 diseases with around 2000 images for each disease. But while training the model, some epochs are getting skipped with an OUT_OF_RANGE error. Can someone please help me figure this out?
import os
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Input
train_dir = 'dataset/train'
valid_dir = 'dataset/valid'
batch_size = 32
train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest'
)
valid_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_directory(
    train_dir,
    target_size=(150, 150),
    batch_size=batch_size,
    class_mode='categorical'
)
valid_generator = valid_datagen.flow_from_directory(
    valid_dir,
    target_size=(150, 150),
    batch_size=batch_size,
    class_mode='categorical'
)
model = Sequential([
    Input(shape=(150, 150, 3)),
    Conv2D(32, (3, 3), activation='relu'),
    MaxPooling2D(2, 2),
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D(2, 2),
    Conv2D(128, (3, 3), activation='relu'),
    MaxPooling2D(2, 2),
    Flatten(),
    Dense(512, activation='relu'),
    Dense(38, activation='softmax')  # Adjust output units based on the number of disease classes
])
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
history = model.fit(
    train_generator,
    steps_per_epoch=train_generator.samples // batch_size,
    epochs=10,
    validation_data=valid_generator,
    validation_steps=valid_generator.samples // batch_size
)
model.save('plant_disease_model.h5')
class_indices = train_generator.class_indices
disease_names = list(class_indices.keys())
print("Mapping of Class Indices to Disease Names:", class_indices)
Terminal:
Found 70295 images belonging to 38 classes.
Found 17572 images belonging to 38 classes.
2024-04-23 19:50:32.085744: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
Epoch 1/10
\.venv\Lib\site-packages\keras\src\trainers\data_adapters\py_dataset_adapter.py:120: UserWarning: Your `PyDataset` class should call `super().__init__(**kwargs)` in its constructor. `**kwargs` can include `workers`, `use_multiprocessing`, `max_queue_size`. Do not pass these arguments to `fit()`, as they will be ignored.
  self._warn_if_super_not_called()
2196/2196 ━━━━━━━━━━━━━━━━━━━━ 905s 411ms/step - accuracy: 0.4608 - loss: 1.8737 - val_accuracy: 0.7432 - val_loss: 0.8556
Epoch 2/10
   1/2196 ━━━━━━━━━━━━━━━━━━━━ 12:02 329ms/step - accuracy: 0.6875 - loss: 0.7820
2024-04-23 20:05:37.996528: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
         [[{{node IteratorGetNext}}]]
C:\Users\Admin\AppData\Local\Programs\Python\Python311\Lib\contextlib.py:155: UserWarning: Your input ran out of data; interrupting training. Make sure that your dataset or generator can generate at least `steps_per_epoch * epochs` batches. You may need to use the `.repeat()` function when building your dataset.
  self.gen.throw(typ, value, traceback)
2024-04-23 20:05:38.068817: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
         [[{{node IteratorGetNext}}]]
2196/2196 ━━━━━━━━━━━━━━━━━━━━ 0s 49us/step - accuracy: 0.6875 - loss: 0.7820 - val_accuracy: 0.7500 - val_loss: 0.2462
As you can see above, epoch 1 completed successfully, but epoch 2 was terminated by the error. Likewise, epochs 3, 5, 7, and 9 completed successfully, while epochs 4, 6, 8, and 10 failed with the same error.
Upvotes: 5
Views: 7753
Reputation: 53
This usually happens when the dataset is read without .repeat()
and the iterator tries to read beyond the end of the dataset.
That in turn happens when the number of steps per epoch is set incorrectly.
Make sure to pay attention to your test and validation dataset iterators too!
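To see why the failures alternate between epochs, here is a simplified, TensorFlow-free model of the behavior (an assumption on my part: Keras keeps drawing from the same finite sequence of batches and only restarts it after an end-of-sequence error, which matches the pattern in your log):

```python
# Simplified, TF-free sketch of the failure: the data yields 2197
# batches in total (math.ceil(70295 / 32)), but each epoch requests
# only 2196 steps (70295 // 32), so one batch is left over per pass.
TOTAL_BATCHES = 2197
STEPS_PER_EPOCH = 2196  # the miscounted setting

def run_epochs(n_epochs):
    results = []
    batches = iter(range(TOTAL_BATCHES))
    for epoch in range(1, n_epochs + 1):
        try:
            for _ in range(STEPS_PER_EPOCH):
                next(batches)
            results.append((epoch, "completed"))
        except StopIteration:  # surfaced by TensorFlow as OUT_OF_RANGE
            results.append((epoch, "OUT_OF_RANGE"))
            batches = iter(range(TOTAL_BATCHES))  # start a fresh pass
    return results

print(run_epochs(4))
# [(1, 'completed'), (2, 'OUT_OF_RANGE'), (3, 'completed'), (4, 'OUT_OF_RANGE')]
```

The leftover batch accumulates, so every second epoch hits the end of the sequence almost immediately, exactly the odd-succeeds/even-fails pattern you observed.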
Calculate it manually.
Example calculation with your dataset:
70295 training images / 32 batch size = 2196.71...
17572 validation images / 32 batch size = 549.125
As you don't have an integer as a result, you can use math.ceil
for an accurate calculation:
This function rounds up to the nearest integer, ensuring all batches are included in training and validation.
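As a quick sanity check, here are the step counts computed from the sample counts printed in your own log:

```python
import math

train_samples = 70295  # "Found 70295 images belonging to 38 classes."
valid_samples = 17572  # "Found 17572 images belonging to 38 classes."
batch_size = 32

# Floor division silently drops the final partial batch:
print(train_samples // batch_size)            # 2196 -> 23 images never requested
# Ceiling division covers every image:
print(math.ceil(train_samples / batch_size))  # 2197
print(math.ceil(valid_samples / batch_size))  # 550
```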
Your current code:
history = model.fit(
    train_generator,
    steps_per_epoch=train_generator.samples // batch_size,
    epochs=10,
    validation_data=valid_generator,
    validation_steps=valid_generator.samples // batch_size
)
New version:
import math

history = model.fit(
    train_generator,
    steps_per_epoch=math.ceil(train_generator.samples / batch_size),
    epochs=10,
    validation_data=valid_generator,
    validation_steps=math.ceil(valid_generator.samples / batch_size)
)
Upvotes: 4