Reputation: 73
I'm trying to use a Convolutional Neural Network (CNN) for image classification, and I want to use K-fold cross-validation to split the data into training and test sets. I'm new to this and don't really understand how to do it.
I've tried K-fold cross-validation and a CNN in separate pieces of code, but I don't know how to combine them.
I'm using iris_data.csv with 3 classes as the example input.
import pandas as pd
import numpy as np
from sklearn.model_selection import KFold
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVR
dataset = pd.read_csv('iris_data.csv')
x = dataset.iloc[:, 0:4]  # columns 0-3 hold the four features (0:3 would drop the last one)
y = dataset.iloc[:, 4]
scaler = MinMaxScaler(feature_range=(0, 1))
x = scaler.fit_transform(x)
cv = KFold(n_splits=10, shuffle=False)
for train_index, test_index in cv.split(x):
    print("Train Index: ", train_index, "\n")
    print("Test Index: ", test_index)
    x_train, x_test, y_train, y_test = x[train_index], x[test_index], y[train_index], y[test_index]
And here is the CNN code example.
import numpy as np
import tensorflow as tf
from keras.models import Model
from keras.layers import Input, Activation, Dense, Conv2D, MaxPooling2D, Flatten
from keras.preprocessing.image import ImageDataGenerator
from keras.optimizers import Adam
from keras.callbacks import TensorBoard
# Images Dimensions
img_width, img_height = 200, 200
# Data Path
train_data_dir = 'data/train'
validation_data_dir = 'data/validation'
# Parameters
nb_train_samples = 100
nb_validation_samples = 50
epochs = 50
batch_size = 10
# TensorBoard Callbacks
callbacks = TensorBoard(log_dir='./Graph')
# Training Data Augmentation
train_datagen = ImageDataGenerator(
    rescale=1. / 255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)
# Rescale Testing Data
test_datagen = ImageDataGenerator(rescale=1. / 255)
# Train Data Generator
train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_width, img_height),
    batch_size=batch_size,
    class_mode='categorical')
# Testing Data Generator
validation_generator = test_datagen.flow_from_directory(
    validation_data_dir,
    target_size=(img_width, img_height),
    batch_size=batch_size,
    class_mode='categorical')
# Feature Extraction Layer KorNet
inputs = Input(shape=(img_width, img_height, 3))
conv_layer = Conv2D(16, (5, 5), strides=(3,3), activation='relu')(inputs)
conv_layer = MaxPooling2D((2, 2))(conv_layer)
conv_layer = Conv2D(32, (5, 5), strides=(3,3), activation='relu')(conv_layer)
conv_layer = MaxPooling2D((2, 2))(conv_layer)
# Flatten Layer
flatten = Flatten()(conv_layer)
# Fully Connected Layer
fc_layer = Dense(32, activation='relu')(flatten)
outputs = Dense(3, activation='softmax')(fc_layer)
model = Model(inputs=inputs, outputs=outputs)
# Adam Optimizer and Cross Entropy Loss
adam = Adam(lr=0.0001)
model.compile(optimizer=adam, loss='categorical_crossentropy', metrics=['accuracy'])
# Print Model Summary
model.summary()  # summary() prints the table itself; wrapping it in print() also prints "None"
model.fit_generator(
    train_generator,
    steps_per_epoch=nb_train_samples // batch_size,
    epochs=epochs,
    validation_data=validation_generator,
    validation_steps=nb_validation_samples // batch_size,
    callbacks=[callbacks])
model.save('./models/model.h5')
model.save_weights('./models/weights.h5')
I want to use the splits produced by K-fold cross-validation as the training and testing data for the CNN.
Upvotes: 6
Views: 9249
Reputation: 64
OK, you first have to understand how the K-fold method works. I will assume you do; if you don't, you can review this resource to get a deeper understanding:
https://vitalflux.com/k-fold-cross-validation-python-example/
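For intuition, the splitting step itself can be sketched in a few lines of plain Python. This is only an illustration of the idea, not scikit-learn's implementation, and the helper name `k_fold_indices` is made up for this example.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for k consecutive folds, without shuffling."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Spread the remainder over the first folds so fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        test = list(range(start, stop))
        # Everything outside the current test fold is used for training.
        train = list(range(0, start)) + list(range(stop, n_samples))
        yield train, test
        start = stop


for fold, (train, test) in enumerate(k_fold_indices(6, 3)):
    print(f"fold {fold}: train={train} test={test}")
```

Each sample ends up in the test set exactly once and in the training set k - 1 times.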
Next, you need to divide your dataset into k folds. I recommend either looking at previous notebooks for the same dataset to see how many folds they used, or choosing k from the dataset size: a larger number of folds if the dataset is small, and a smaller number if it is large.
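To get a feeling for that trade-off, here is a rough sketch that prints how much data each fold trains and tests on for a few values of k. The dataset size of 150 is an assumption (the size of the classic iris data), and `fold_sizes` is a hypothetical helper written for this example.

```python
def fold_sizes(n_samples, k):
    """Return (train_size, test_size) of the largest test fold when splitting into k folds."""
    test_size = n_samples // k + (1 if n_samples % k else 0)
    return n_samples - test_size, test_size


n_samples = 150  # assumed dataset size, adjust to your own data
for k in (3, 5, 10):
    train_size, test_size = fold_sizes(n_samples, k)
    print(f"k={k:2d}: {k} models to train, {train_size} train / {test_size} test samples per fold")
```

A larger k leaves more samples for training in every fold, but it also means more models to train and a smaller test set per fold.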
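The last step before applying K-fold is to split your dataset into X and y:
import numpy as np
X = np.asarray(np_image_list)  # NumPy arrays, so they can be indexed with the fold index arrays
y = np.asarray(image_labels)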
At this point you can initialize lists that collect each metric (accuracy, recall, ...) per fold, so that you can average them afterwards:
from sklearn.metrics import accuracy_score
from sklearn.model_selection import KFold
Train_accuracy = []
Test_accuracy = []
k = 10
# random_state should be an integer seed (any fixed value gives reproducible folds)
kf = KFold(n_splits=k, random_state=42, shuffle=True)
for train_index, test_index in kf.split(X):
    X_train, X_test = X[train_index], X[test_index]
    y_train, y_test = y[train_index], y[test_index]
    # Your entire model goes here: build it, train it on X_train / y_train,
    # and store its predicted labels for X_train in Model_pred_train
    # Then evaluate the model (accuracy, recall, etc.)
    # and append the calculated accuracy to your list
    t_accuracy = accuracy_score(y_train, Model_pred_train) * 100
    Train_accuracy.append(t_accuracy)
Finally, take the average over the folds. If the number of folds is, say, k = 10, then:
print("Average Train Accuracy = " + str(sum(Train_accuracy) / k))
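It is often useful to report the spread across folds next to the average. A minimal sketch using only the standard library follows; the accuracy values are made up for illustration and `summarize` is a hypothetical helper.

```python
from statistics import mean, stdev


def summarize(scores):
    """Return (mean, sample standard deviation) of the per-fold scores."""
    return mean(scores), stdev(scores)


# Made-up per-fold accuracies in percent, for illustration only.
fold_accuracy = [90.0, 95.0, 100.0]
avg, spread = summarize(fold_accuracy)
print(f"Average accuracy = {avg:.2f}% (+/- {spread:.2f}%)")
```

Note that `statistics.stdev` is the sample standard deviation, while `numpy.std` defaults to the population standard deviation, so the two give slightly different numbers for the same scores.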
You can also look at my code: I created a tutorial that applies the K-fold method to an image classification model using AlexNet, which is a CNN. https://github.com/ZienabEsam/Image-Classifcation-using-K-fold-method
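Putting the pieces together, the loop could look roughly like the sketch below. It assumes the images are already loaded into a NumPy array `X` of shape (n_samples, height, width, channels) and that `y` holds integer class ids. The names `build_cnn` and `cross_validate`, the tiny architecture, and the seed are hypothetical choices for this example, not part of the question or of the tutorial linked above.

```python
import numpy as np
from sklearn.model_selection import KFold


def build_cnn(input_shape, n_classes):
    """Build a small, freshly initialised CNN (hypothetical architecture)."""
    # Imported lazily so the cross-validation helper below works without TensorFlow.
    from tensorflow import keras
    from tensorflow.keras import layers

    model = keras.Sequential([
        keras.Input(shape=input_shape),
        layers.Conv2D(16, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        layers.Dense(n_classes, activation="softmax"),
    ])
    # Sparse loss because the labels are integer class ids, not one-hot vectors.
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model


def cross_validate(build_model, X, y, k=5, seed=0, **fit_kwargs):
    """Train a new model on each fold and return the list of test accuracies."""
    kf = KFold(n_splits=k, shuffle=True, random_state=seed)
    scores = []
    for train_idx, test_idx in kf.split(X):
        model = build_model()  # new weights every fold, nothing carried over
        model.fit(X[train_idx], y[train_idx], **fit_kwargs)
        # With a single compiled metric, evaluate() returns [loss, accuracy].
        _, accuracy = model.evaluate(X[test_idx], y[test_idx], verbose=0)
        scores.append(accuracy)
    return scores
```

A call might then look like `scores = cross_validate(lambda: build_cnn(X.shape[1:], 3), X, y, k=10, epochs=50, batch_size=10, verbose=0)`, followed by `np.mean(scores)`. The important design choice is that the model is rebuilt inside the loop; reusing one model across folds would let it see test samples from earlier folds during training.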
Upvotes: 0
Reputation: 872
Just do something like this:
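from keras.models import Sequential
from keras.layers import Dense
from sklearn.model_selection import KFold
import numpy
# fix the random seed for reproducibility (arbitrary value)
seed = 7
# assumes a purely numeric CSV with no header row and integer-encoded class labels
dataset = numpy.loadtxt("iris_data.csv", delimiter=",")
# split into input (X) and output (Y) variables
X = dataset[:, 0:4]
Y = dataset[:, 4]
# define 10-fold cross validation test harness
kfold = KFold(n_splits=10, shuffle=True, random_state=seed)
cvscores = []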
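for train, test in kfold.split(X, Y):
    # create a fresh model for every fold so no weights carry over
    model = Sequential()
    # input_dim must match the number of feature columns in X
    model.add(Dense(12, input_dim=X.shape[1], activation='relu'))
    # ... (remaining layers elided in the original answer)
    # Compile model (categorical_crossentropy expects one-hot encoded labels;
    # use sparse_categorical_crossentropy if Y holds integer class ids)
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    # Fit the model
    model.fit(X[train], Y[train], epochs=150, batch_size=10, verbose=0)
    # evaluate the model
    scores = model.evaluate(X[test], Y[test], verbose=0)
    print("%s: %.2f%%" % (model.metrics_names[1], scores[1]*100))
    cvscores.append(scores[1] * 100)
print("%.2f%% (+/- %.2f%%)" % (numpy.mean(cvscores), numpy.std(cvscores)))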
See this: https://machinelearningmastery.com/evaluate-performance-deep-learning-models-keras/
Upvotes: 4