code_to_joy

Reputation: 623

How can I get reproducible results in keras for a convolutional neural network using data augmentation for image classification?

If I train the same convolutional neural network model architecture (on the same data) twice, clearing the session between runs, I get different results.

[TensorBoard plot: the loss/accuracy curves differ between the two training runs]

I've set random seeds and thread config as follows:

import numpy as np
from numpy.random import seed
import pandas as pd
import random as rn
import os

# seed Python's hash, NumPy and the random module before importing TensorFlow
seed_num = 1
os.environ['PYTHONHASHSEED'] = '0'
np.random.seed(seed_num)
rn.seed(seed_num)

import tensorflow as tf
from tensorflow.keras.models import load_model
from tensorflow.compat.v1.keras import backend as K

# single-threaded ops to avoid nondeterminism from thread scheduling
session_conf = tf.compat.v1.ConfigProto(intra_op_parallelism_threads=1, inter_op_parallelism_threads=1)
tf.random.set_seed(seed_num)
sess = tf.compat.v1.Session(graph=tf.compat.v1.get_default_graph(), config=session_conf)
K.set_session(sess)

...and I've specified the seed when running flow_from_directory:

train_data_gen_aug_rotate = tf.keras.preprocessing.image.ImageDataGenerator(rescale=1./255, 
                                                                 validation_split=0.1,
                                                                 rotation_range=45)

train_img = train_data_gen_aug_rotate.flow_from_directory(data_path, 
                                               subset='training',
                                               color_mode='rgb', 
                                               target_size=target_size,
                                               batch_size=batch_size, 
                                               class_mode='categorical',
                                               seed=seed_num)

Other information, in case it's useful for answering the question:

The model architecture:

inputs = tf.keras.layers.Input(shape=num_pixels_and_channels)
conv = tf.keras.layers.Conv2D(filters=64, kernel_size=(3,3), padding='SAME', activation='relu')(inputs)
pool = tf.keras.layers.AveragePooling2D(pool_size=(2,2), strides=(2,2), padding='SAME')(conv)
batnorm = tf.keras.layers.BatchNormalization()(pool)
flattened = tf.keras.layers.Flatten()(batnorm)
dense = tf.keras.layers.Dense(257)(flattened)
outputs = tf.keras.layers.Softmax()(dense)

my_model = tf.keras.Model(inputs, outputs)

I'm compiling the model as follows:

my_model.compile(loss='categorical_crossentropy',
                 optimizer=tf.keras.optimizers.Adam(0.001),
                 metrics=['accuracy']
                )

I'm using train_on_batch and test_on_batch:

# get next batch of images & labels
X_imgs, X_labels = next(train_img) 

# train model; returns cross entropy & accuracy for the batch
train_CE, train_acc = my_model.train_on_batch(X_imgs, X_labels)

# validation batch - evaluate only, no weight update
X_imgs_val, X_labels_val = next(val_img)
val_CE, val_acc = my_model.test_on_batch(X_imgs_val, X_labels_val)
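For completeness, these calls sit inside a loop along the following lines (a minimal sketch; num_batches is an illustrative name, not from my actual notebook):

num_batches = len(train_img)  # batches per epoch reported by the iterator

for batch_idx in range(num_batches):
    # get next batch of images & labels
    X_imgs, X_labels = next(train_img)

    # train model; returns cross entropy & accuracy for the batch
    train_CE, train_acc = my_model.train_on_batch(X_imgs, X_labels)

    # validation batch - evaluate only, no weight update
    X_imgs_val, X_labels_val = next(val_img)
    val_CE, val_acc = my_model.test_on_batch(X_imgs_val, X_labels_val)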

I'm clearing the session between runs with tf.keras.backend.clear_session().

I'm working on a Mac with a single CPU, in a Jupyter notebook, using TensorFlow version 2.1.0.

I've asked whether the glorot_uniform kernel_initializer in Conv2D uses the seed set by tf.random.set_seed(), and the single answer provided so far (at the time of writing) says "yes".
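For what it's worth, if initializer randomness turns out to be the culprit, the seed can also be pinned explicitly on the initializer itself. This is an alternative I'm noting, not something I've done in the code above:

# Hypothetical variant: pin the initializer's own seed so the Conv2D
# weights don't depend on the global TF seed at all
init = tf.keras.initializers.GlorotUniform(seed=seed_num)
conv = tf.keras.layers.Conv2D(filters=64, kernel_size=(3,3), padding='SAME',
                              activation='relu',
                              kernel_initializer=init)(inputs)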

What else do I need to do to get the same result for the same model architecture? Is there a random seed related to ImageDataGenerator, train_on_batch, test_on_batch, or the Adam optimizer that doesn't use the TensorFlow seed I've set? Or another part of the code that needs to have its seed specified separately?

Upvotes: 0

Views: 1330

Answers (1)

code_to_joy

Reputation: 623

I've now got reproducible results. The initial random weights are the same for each experiment, which ensures that any difference in results is due to the differences between the experiments rather than to different initial weights. I did this by:

1) Clearing the session and setting the random seeds and the TF session config before each experiment:

tf.keras.backend.clear_session()

seed_num = 1
os.environ['PYTHONHASHSEED'] = '0'
np.random.seed(seed_num)
rn.seed(seed_num)
tf.random.set_seed(seed_num)
session_conf = tf.compat.v1.ConfigProto(intra_op_parallelism_threads=1, inter_op_parallelism_threads=1)
sess = tf.compat.v1.Session(graph=tf.compat.v1.get_default_graph(), config=session_conf)
K.set_session(sess)

2) Running the ImageDataGenerator and flow_from_directory code again before each experiment, so that each run starts from the beginning of the random number sequence for that seed when training the next model.

So my code from the beginning of the notebook until the 1st experiment is:

import numpy as np
from numpy.random import seed
import pandas as pd
import random as rn
import os

seed_num = 1
os.environ['PYTHONHASHSEED'] = '0'
np.random.seed(seed_num)
rn.seed(seed_num)

import tensorflow as tf
from tensorflow.keras.models import load_model
from tensorflow.compat.v1.keras import backend as K

session_conf = tf.compat.v1.ConfigProto(intra_op_parallelism_threads=1, inter_op_parallelism_threads=1)
tf.random.set_seed(seed_num)
sess = tf.compat.v1.Session(graph=tf.compat.v1.get_default_graph(), config=session_conf)
K.set_session(sess)

...then for each experiment, before model architecture definition and compiling the model, it is:

tf.keras.backend.clear_session()

seed_num = 1
os.environ['PYTHONHASHSEED'] = '0'
np.random.seed(seed_num)
rn.seed(seed_num)
tf.random.set_seed(seed_num)
session_conf = tf.compat.v1.ConfigProto(intra_op_parallelism_threads=1, inter_op_parallelism_threads=1)
sess = tf.compat.v1.Session(graph=tf.compat.v1.get_default_graph(), config=session_conf)
K.set_session(sess)


batch_size = 80
target_size = (64, 64)
num_pixels_and_channels = (64, 64, 3)

train_data_gen_aug_rotate = tf.keras.preprocessing.image.ImageDataGenerator(rescale=1./255, 
                                                                 validation_split=0.1,
                                                                 rotation_range=45)

train_img = train_data_gen_aug_rotate.flow_from_directory(data_path, 
                                               subset='training',
                                               color_mode='rgb', 
                                               target_size=target_size,
                                               batch_size=batch_size, 
                                               class_mode='categorical',
                                               seed=seed_num)

val_data_gen_aug_rotate = tf.keras.preprocessing.image.ImageDataGenerator(rescale=1./255, 
                                                               validation_split=0.1)


val_img = val_data_gen_aug_rotate.flow_from_directory(data_path, 
                                           subset='validation',
                                           color_mode='rgb',
                                           target_size=target_size,
                                           batch_size=batch_size,
                                           class_mode='categorical',
                                           seed=seed_num)

I don't know if this is overkill, and there may be a more efficient way of doing it, but this works for me on my laptop with a single CPU. (Running on a GPU introduces additional variability.)
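To cut down on the copy-and-paste, the reset steps could be wrapped in a small helper, something like this (reset_random_state is a hypothetical name, not part of any library):

import os
import random as rn
import numpy as np
import tensorflow as tf
from tensorflow.compat.v1.keras import backend as K

def reset_random_state(seed_num=1):
    """Hypothetical helper bundling the reset steps above so that
    each experiment starts from an identical random state."""
    tf.keras.backend.clear_session()
    os.environ['PYTHONHASHSEED'] = '0'
    np.random.seed(seed_num)
    rn.seed(seed_num)
    tf.random.set_seed(seed_num)
    # single-threaded ops to avoid nondeterminism from thread scheduling
    session_conf = tf.compat.v1.ConfigProto(intra_op_parallelism_threads=1,
                                            inter_op_parallelism_threads=1)
    sess = tf.compat.v1.Session(graph=tf.compat.v1.get_default_graph(),
                                config=session_conf)
    K.set_session(sess)

Calling reset_random_state(seed_num) and then re-running the ImageDataGenerator and flow_from_directory code before each experiment covers both steps 1) and 2) above.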

Upvotes: 1
