Mars

Reputation: 73

How to Use KFold Cross Validation Output as CNN Input for Image Processing?

I'm trying to use a Convolutional Neural Network (CNN) for image classification, and I want to use KFold cross validation to split the data into training and test sets. I'm new to this and I don't really understand how to do it.

I've tried KFold cross validation and a CNN in separate pieces of code, but I don't know how to combine them.

I'm using iris_data.csv with 3 classes as the example input.

import pandas as pd
import numpy as np
from sklearn.model_selection import KFold
from sklearn.preprocessing import MinMaxScaler

dataset = pd.read_csv('iris_data.csv')

# Columns 0-3 hold the four features; column 4 holds the class label
x = dataset.iloc[:, 0:4]
# Convert to a plain array so the positional fold indices work below
y = dataset.iloc[:, 4].values

scaler = MinMaxScaler(feature_range=(0, 1))
x = scaler.fit_transform(x)

cv = KFold(n_splits=10, shuffle=False)
for train_index, test_index in cv.split(x):
    print("Train Index: ", train_index, "\n")
    print("Test Index: ", test_index)

    x_train, x_test = x[train_index], x[test_index]
    y_train, y_test = y[train_index], y[test_index]

And here is the CNN code example.

import numpy as np
import tensorflow as tf
from keras.models import Model
from keras.layers import Input, Activation, Dense, Conv2D, MaxPooling2D, Flatten
from keras.preprocessing.image import ImageDataGenerator
from keras.optimizers import Adam
from keras.callbacks import TensorBoard

# Images Dimensions
img_width, img_height = 200, 200

# Data Path
train_data_dir = 'data/train'
validation_data_dir = 'data/validation'

# Parameters
nb_train_samples = 100
nb_validation_samples = 50
epochs = 50
batch_size = 10

# TensorBoard Callbacks
callbacks = TensorBoard(log_dir='./Graph')

# Training Data Augmentation
train_datagen = ImageDataGenerator(
    rescale=1. / 255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)

# Rescale Testing Data
test_datagen = ImageDataGenerator(rescale=1. / 255)

# Train Data Generator
train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_width, img_height),
    batch_size=batch_size,
    class_mode='categorical')

# Testing Data Generator
validation_generator = test_datagen.flow_from_directory(
    validation_data_dir,
    target_size=(img_width, img_height),
    batch_size=batch_size,
    class_mode='categorical')

# Feature Extraction Layer KorNet
inputs = Input(shape=(img_width, img_height, 3))
conv_layer = Conv2D(16, (5, 5), strides=(3,3), activation='relu')(inputs) 
conv_layer = MaxPooling2D((2, 2))(conv_layer) 
conv_layer = Conv2D(32, (5, 5), strides=(3,3), activation='relu')(conv_layer) 
conv_layer = MaxPooling2D((2, 2))(conv_layer) 

# Flatten Layer
flatten = Flatten()(conv_layer) 

# Fully Connected Layer
fc_layer = Dense(32, activation='relu')(flatten)
outputs = Dense(3, activation='softmax')(fc_layer)

model = Model(inputs=inputs, outputs=outputs)

# Adam Optimizer and Cross Entropy Loss
adam = Adam(learning_rate=0.0001)
model.compile(optimizer=adam, loss='categorical_crossentropy', metrics=['accuracy'])

# Print Model Summary
model.summary()

# fit() accepts generators directly; fit_generator() is deprecated
model.fit(
    train_generator,
    steps_per_epoch=nb_train_samples // batch_size,
    epochs=epochs,
    validation_data=validation_generator,
    validation_steps=nb_validation_samples // batch_size, 
    callbacks=[callbacks])

model.save('./models/model.h5')
model.save_weights('./models/weights.h5')

I want the folds produced by KFold cross validation to be used as the training and testing data for the CNN.

Upvotes: 6

Views: 9249

Answers (2)

Zeina

Reputation: 64

OK, you have to understand how the K-fold method works first. I will assume you do; if you don't, you can review this resource to get a deeper understanding:

https://vitalflux.com/k-fold-cross-validation-python-example/

So you need to divide your dataset into k folds. I recommend you either look at previous notebooks on the same dataset and see how many folds they used, or pick k from the dataset size: if the dataset is small, use a large number of folds, and vice versa.
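To make that trade-off concrete, here is a minimal sketch of how the number of folds changes the size of each training and test split. The 150-sample placeholder array is an assumption (it matches the size of the classic iris dataset); only the sample count matters for the split.

```python
import numpy as np
from sklearn.model_selection import KFold

# Placeholder data: only the number of rows matters here
n_samples = 150
X_demo = np.zeros((n_samples, 1))

sizes = {}
for k in (5, 10):
    # Look at the first fold only; the other folds have the same sizes here
    train_index, test_index = next(KFold(n_splits=k).split(X_demo))
    sizes[k] = (len(train_index), len(test_index))
    print("k=%d: %d train / %d test" % (k, len(train_index), len(test_index)))
# k=5:  120 train / 30 test
# k=10: 135 train / 15 test
```

A larger k leaves more data for training in every fold, at the cost of training the model more times.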

The last step before applying K-fold is to split your dataset into X and y:

X = np_image_list
y = image_labels

At this point you can initialize lists to collect each metric (accuracy, recall, etc.) per fold, so that you can average them afterwards:

Train_accuracy = []
Test_accuracy = []

from sklearn.metrics import accuracy_score
from sklearn.model_selection import KFold

k = 10
kf = KFold(n_splits=k, shuffle=True, random_state=1)

for train_index, test_index in kf.split(X):
    X_train, X_test = X[train_index], X[test_index]
    y_train, y_test = y[train_index], y[test_index]
    # Build and train your entire model here, on X_train / y_train only.
    # Then evaluate the model (accuracy, recall, etc.).
    # model_pred_train must hold the model's predicted labels for X_train.
    train_accuracy = accuracy_score(y_train, model_pred_train) * 100
    # Append the accuracy of this fold to your list
    Train_accuracy.append(train_accuracy)

Finally take the average over the folds. If the number of folds is, say, k = 10, then

print("Average Train Accuracy = " + str(sum(Train_accuracy) / k))

You can look at my code. I created a tutorial here on applying the K-fold method to an image classification model, using AlexNet, which is a CNN: https://github.com/ZienabEsam/Image-Classifcation-using-K-fold-method
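As a rough end-to-end illustration of the loop above, here is a minimal sketch that trains a small CNN once per fold and collects the test accuracy. Everything specific in it is an assumption made for illustration: the helper names build_cnn and cross_validate, the tiny architecture, and the random placeholder arrays standing in for np_image_list and image_labels. It also assumes the images fit in memory as one NumPy array and that the labels are integer class ids.

```python
import numpy as np
from sklearn.model_selection import KFold
from tensorflow import keras
from tensorflow.keras import layers


def build_cnn(input_shape, n_classes):
    """Return a freshly initialised small CNN (hypothetical architecture)."""
    model = keras.Sequential([
        keras.Input(shape=input_shape),
        layers.Conv2D(8, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        layers.Dense(n_classes, activation="softmax"),
    ])
    # Sparse loss, because the labels are integer class ids (not one-hot)
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model


def cross_validate(X, y, n_classes, k=5, epochs=5, batch_size=10, seed=0):
    """Train one model per fold and return the list of test accuracies."""
    kf = KFold(n_splits=k, shuffle=True, random_state=seed)
    test_accuracy = []
    for train_index, test_index in kf.split(X):
        # Rebuild the model so no weights leak from one fold to the next
        model = build_cnn(X.shape[1:], n_classes)
        model.fit(X[train_index], y[train_index],
                  epochs=epochs, batch_size=batch_size, verbose=0)
        _, acc = model.evaluate(X[test_index], y[test_index], verbose=0)
        test_accuracy.append(acc)
    return test_accuracy


# Random placeholder data: 30 RGB images of 16x16 pixels, 3 classes
rng = np.random.default_rng(0)
X = rng.random((30, 16, 16, 3)).astype("float32")
y = rng.integers(0, 3, size=30)

scores = cross_validate(X, y, n_classes=3, k=3, epochs=1)
print("Mean test accuracy: %.3f" % np.mean(scores))
```

The important design choice is that the model is created inside the loop. Reusing one model across folds would let it see test images from earlier folds during training.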

Upvotes: 0

christk

Reputation: 872

Just do something like this:

import numpy
from keras.models import Sequential
from keras.layers import Dense
from sklearn.model_selection import KFold

# fix the random seed for reproducibility
seed = 7
dataset = numpy.loadtxt("iris_data.csv", delimiter=",")
# split into input (X) and output (Y) variables
X = dataset[:, 0:4]
Y = dataset[:, 4]
# define 10-fold cross validation test harness
kfold = KFold(n_splits=10, shuffle=True, random_state=seed)
cvscores = []
for train, test in kfold.split(X, Y):
    # create a fresh model for every fold
    model = Sequential()
    # input_dim must match the number of feature columns in X
    model.add(Dense(12, input_dim=X.shape[1], activation='relu'))
    # ... (remaining layers go here)
    # Compile model (sparse loss, because Y holds integer class labels)
    model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    # Fit the model
    model.fit(X[train], Y[train], epochs=150, batch_size=10, verbose=0)
    # evaluate the model
    scores = model.evaluate(X[test], Y[test], verbose=0)
    print("%s: %.2f%%" % (model.metrics_names[1], scores[1]*100))
    cvscores.append(scores[1] * 100)
print("%.2f%% (+/- %.2f%%)" % (numpy.mean(cvscores), numpy.std(cvscores)))

See this: https://machinelearningmastery.com/evaluate-performance-deep-learning-models-keras/
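If the images sit in folders and are read through a generator, as in the question, the same idea can be applied to a table of file names instead of to the pixel arrays. The following is only a sketch under assumptions: the helper name kfold_frames, the column names "filename" and "class", and the made-up file listing are all hypothetical.

```python
import pandas as pd
from sklearn.model_selection import KFold


def kfold_frames(df, k=5, seed=0):
    """Yield one (train_df, test_df) pair per fold."""
    kf = KFold(n_splits=k, shuffle=True, random_state=seed)
    for train_index, test_index in kf.split(df):
        yield df.iloc[train_index], df.iloc[test_index]


# Hypothetical listing: one row per image file with its class label
df = pd.DataFrame({
    "filename": ["img_%02d.jpg" % i for i in range(12)],
    "class": ["a", "b", "c"] * 4,
})

folds = list(kfold_frames(df, k=4))
for fold, (train_df, test_df) in enumerate(folds):
    print("fold %d: %d train / %d test" % (fold, len(train_df), len(test_df)))
    # Assumption: in Keras versions that ship ImageDataGenerator, each frame
    # could be passed to flow_from_dataframe(train_df, x_col="filename",
    # y_col="class", ...) in place of flow_from_directory().
```

Splitting the file list keeps only one batch of images in memory at a time, which matters once the dataset no longer fits in a single array.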

Upvotes: 4
