How to implement K-Fold cross-validation using ImageDataGenerator and flow_from_dataframe (using a CSV file)

Question

Please show or explain a dummy code snippet demonstrating K-fold cross-validation in Keras with flow_from_dataframe, using separate training and validation generator objects. This is the code I currently have (no K-fold, only a simple fit):

ImageDataGenerator object that performs all the augmentations:

IMG_SIZE = (150, 150)
core_idg = ImageDataGenerator(samplewise_center=True, 
                              samplewise_std_normalization=True, 
                              horizontal_flip = True, 
                              vertical_flip = False, 
                              height_shift_range= 0.05, 
                              width_shift_range=0.1, 
                              rotation_range=5, 
                              shear_range = 0.1,
                              fill_mode = 'reflect',
                              zoom_range=0.15)

Split the main dataframe into train_df and valid_df:

train_df, valid_df = train_test_split(main_DF, 
                                   test_size = 0.10, 
                                   random_state = 2018,
                                   stratify = main_DF['BINARY'])

Create train_gen and valid_gen using the flow_from_dataframe method of the ImageDataGenerator object created above.

"IMAGE_NAMES" and "BINARY" are the columns that contain the image file names and the labels (0 or 1), respectively.

all_labels = [ "0" , "1" ]
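A side note on the label column: with class_mode = 'categorical', flow_from_dataframe expects the y_col values to be strings (or lists/tuples), so an integer 0/1 column usually has to be cast first. A minimal sketch, assuming a small stand-in dataframe (the file names here are made up for illustration):

```python
import pandas as pd

# Hypothetical stand-in for main_DF; the real one would come from the CSV file.
main_DF = pd.DataFrame({
    "IMAGE_NAMES": ["a.png", "b.png", "c.png"],
    "BINARY": [0, 1, 0],
})

# Cast integer labels to strings so they match the entries of all_labels.
main_DF["BINARY"] = main_DF["BINARY"].astype(str)

# Derive the class list from the data instead of hard-coding it.
all_labels = sorted(main_DF["BINARY"].unique())
```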

train_gen = core_idg.flow_from_dataframe(dataframe=train_df,
                                         directory="./DataFolder/",
                                         x_col = 'IMAGE_NAMES',
                                         y_col = 'BINARY',
                                         class_mode = 'categorical',
                                         classes = all_labels,
                                         target_size = IMG_SIZE,
                                         color_mode = 'rgb',
                                         batch_size = 64)

valid_gen = core_idg.flow_from_dataframe(dataframe=valid_df,
                                         directory="./DataFolder/",
                                         x_col = 'IMAGE_NAMES',
                                         y_col = 'BINARY',
                                         class_mode = 'categorical',
                                         classes = all_labels,
                                         target_size = IMG_SIZE,
                                         color_mode = 'rgb',
                                         batch_size = 256)

test_X, test_Y = next(core_idg.flow_from_dataframe(dataframe=valid_df,
                                         directory="./DataFolder/",
                                         x_col = 'IMAGE_NAMES',
                                         y_col = 'BINARY',
                                         class_mode = 'categorical',
                                         classes = all_labels,
                                         target_size = IMG_SIZE,
                                         color_mode = 'rgb',
                                         batch_size = 256))

#fitting
hist = model.fit_generator(train_gen, 
                              validation_data = (test_X, test_Y), 
                              epochs = 30, 
                              callbacks = call_list)

How can this be translated to K-fold cross-validation? As I understand it, core_idg should be created once outside the K-fold loop, and instead of a single train_df and valid_df, the indices produced by a K-fold splitter should be used to select the rows for each fold. How can the code snippet above be transformed accordingly?
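One possible shape for such a loop is sketched below. It is not a definitive implementation and makes several assumptions: scikit-learn's StratifiedKFold is used to produce the fold indices; make_folds, run_kfold, build_model and valid_idg are hypothetical names introduced for illustration; build_model is assumed to return a freshly compiled model so that weights do not carry over between folds; and model.fit is called with generators directly, which works in TensorFlow 2.1 and later (older Keras versions would use fit_generator as in the question).

```python
import pandas as pd
from sklearn.model_selection import StratifiedKFold

IMG_SIZE = (150, 150)
ALL_LABELS = ["0", "1"]


def make_folds(df: pd.DataFrame, label_col="BINARY", n_splits=5, seed=2018):
    """Yield (fold_number, train_df, valid_df) for stratified K-fold splits."""
    skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    splits = skf.split(df, df[label_col])
    for fold, (train_idx, valid_idx) in enumerate(splits, start=1):
        # split() returns positional indices, hence iloc rather than loc.
        yield fold, df.iloc[train_idx], df.iloc[valid_idx]


def run_kfold(main_DF: pd.DataFrame, build_model, train_idg, valid_idg,
              n_splits=5, epochs=30):
    """Train one fresh model per fold and collect its validation score."""
    # class_mode='categorical' needs string labels, so cast a copy of the column.
    df = main_DF.assign(BINARY=main_DF["BINARY"].astype(str))
    common = dict(directory="./DataFolder/", x_col="IMAGE_NAMES", y_col="BINARY",
                  class_mode="categorical", classes=ALL_LABELS,
                  target_size=IMG_SIZE, color_mode="rgb")
    scores = []
    for fold, train_df, valid_df in make_folds(df, n_splits=n_splits):
        # The generators are rebuilt per fold; the ImageDataGenerator objects
        # themselves are created once, outside the loop.
        train_gen = train_idg.flow_from_dataframe(
            dataframe=train_df, batch_size=64, **common)
        valid_gen = valid_idg.flow_from_dataframe(
            dataframe=valid_df, batch_size=256, shuffle=False, **common)
        model = build_model()  # hypothetical factory: new weights every fold
        model.fit(train_gen, validation_data=valid_gen, epochs=epochs)
        scores.append(model.evaluate(valid_gen))
    return scores
```

With the objects from the question, this could be called as run_kfold(main_DF, build_model, core_idg, valid_idg), where valid_idg might be a second ImageDataGenerator that keeps only samplewise_center and samplewise_std_normalization, so that validation images are normalized but not augmented. Passing core_idg for both arguments reproduces the behaviour of the original snippet. The per-fold scores can then be averaged to get the cross-validated estimate.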
