manoelpqueiroz

Reputation: 637

How to use cross-validation with keras image datasets from directories?

I have an image dataset that I load with Keras, split into training and validation (test) subsets directly by the loading function:

from tensorflow import keras

tds = keras.preprocessing\
    .image_dataset_from_directory('dataset_folder', seed=123,
                                  validation_split=0.35, subset='training')

vds = keras.preprocessing\
    .image_dataset_from_directory('dataset_folder', seed=123,
                                  validation_split=0.35, subset='validation')

Then I go through the usual phases of my neural network:

from tensorflow.keras import layers
from tensorflow.keras.models import Sequential

num_classes = 5

model = Sequential([
    layers.experimental.preprocessing.Rescaling(1.0/255,
                                                input_shape=(256, 256, 3)),
    layers.Conv2D(16, 3, padding='same', activation='relu'),
    layers.MaxPooling2D(),
    layers.Conv2D(32, 3, padding='same', activation='relu'),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(64, activation='relu'),
    layers.Dense(num_classes)])

model\
    .compile(optimizer='adam', metrics=['accuracy'],
             loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True))

hist = model.fit(tds, validation_data=vds, epochs=15)

How can I implement cross-validation using either KFold or StratifiedKFold from sklearn.model_selection? If that requires changing how the data is loaded, I would also be glad to know how to do it.

Upvotes: 2

Views: 3446

Answers (1)

Abhilash Rajan

Reputation: 349

Have a look at these suggestions for implementing cross validation in Keras:

Cross Validation in Keras

https://machinelearningmastery.com/evaluate-performance-deep-learning-models-keras/

Loading the data with image_dataset_from_directory produces a tf.data.Dataset object, which I am not sure will work with the above implementation. One alternative is to convert the images into NumPy arrays, which can then be split by K-fold. For that, you can refer to the following:

How to convert a folder of images into X and Y batches with Keras?
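As a rough starting point for that conversion, here is a minimal sketch that collects file paths and integer labels from a folder laid out as dataset_folder/class_name/image, which is the layout image_dataset_from_directory expects. The helper name list_images_with_labels and the extension list are my own assumptions, not part of Keras.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to whatever your folder contains.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".gif"}


def list_images_with_labels(root):
    """Return (paths, labels, class_names) for a root/class_name/image layout.

    Hypothetical helper: labels are indices into the sorted class names.
    """
    root = Path(root)
    # One subdirectory per class; sorting keeps the label indices reproducible.
    class_names = sorted(p.name for p in root.iterdir() if p.is_dir())
    paths, labels = [], []
    for index, name in enumerate(class_names):
        for file in sorted((root / name).rglob("*")):
            if file.is_file() and file.suffix.lower() in IMAGE_EXTENSIONS:
                paths.append(str(file))
                labels.append(index)
    return paths, labels, class_names
```

From there, each path could be read into an array (for example with keras.preprocessing.image.load_img and img_to_array, resized to 256x256) and stacked with numpy.stack, provided the whole dataset fits in memory.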

Note: The following statement is mentioned in the machine learning mastery link given above:

Cross validation is often not used for evaluating deep learning models because of the greater computational expense. For example k-fold cross validation is often used with 5 or 10 folds. As such, 5 or 10 models must be constructed and evaluated, greatly adding to the evaluation time of a model.
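To make that cost concrete, here is a hedged sketch of a stratified k-fold loop, assuming the images and labels are already NumPy arrays X and y. The names cross_validate and build_model are hypothetical: build_model stands for any function that returns a freshly compiled model (such as the Sequential model from the question), so a new model is trained for every fold.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold


def cross_validate(build_model, X, y, n_splits=5, epochs=15, seed=123):
    """Train one fresh model per fold and return the per-fold evaluation results.

    Hypothetical helper: build_model must return an object with Keras-style
    fit() and evaluate() methods.
    """
    X, y = np.asarray(X), np.asarray(y)
    skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    scores = []
    # Only y matters for stratification, so a placeholder stands in for X here.
    for train_idx, val_idx in skf.split(np.zeros(len(y)), y):
        model = build_model()  # fresh weights, so nothing carries over between folds
        model.fit(X[train_idx], y[train_idx], epochs=epochs, verbose=0)
        # With compiled metrics, Keras evaluate() returns [loss, metric, ...]
        scores.append(model.evaluate(X[val_idx], y[val_idx], verbose=0))
    return np.asarray(scores)
```

With n_splits=5 and epochs=15 this trains five models for 15 epochs each, which is where the extra evaluation time comes from. Averaging the returned array over axis 0 gives the mean loss and accuracy across folds.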

Upvotes: 1
