Reputation: 453
I'm working in Colab with the default TensorFlow and Keras versions (they report tensorflow 2.2.0-rc2, keras 2.3.0-tf).
I've got a very strange problem: the results of model.evaluate() depend on the batch size I'm using, and they change after I shuffle the data, which makes no sense. I've been able to reproduce this in a minimal working example. In my full program (which works in 3D with bigger datasets) the variations are even larger. I don't know whether this depends on batch normalization, but I'd expect its statistics to be fixed at prediction time.
My full program does multiclass segmentation. The minimal example takes a black image with a white square at a random position, adds a little noise, and tries to segment the same white square out of it. I'm using a Keras Sequence as a generator to feed data to the model, which I guess is relevant, since I don't see this behaviour when evaluating on the arrays directly. Here's the code with its output:
#environment setup
%tensorflow_version 2.x
from tensorflow.keras import backend as K
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input,Conv2D, Activation, BatchNormalization
from tensorflow.keras import metrics
#set up a toy model
K.set_image_data_format("channels_last")
inputL = Input([64,64,1])
l1 = Conv2D(4,[3,3],padding='same')(inputL)
l1N = BatchNormalization(axis=-1,momentum=0.9) (l1)
l2 = Activation('relu') (l1N)
l3 = Conv2D(32,[3,3],padding='same')(l2)
l3N = BatchNormalization(axis=-1,momentum=0.9) (l3)
l4 = Activation('relu') (l3N)
l5 = Conv2D(1,[1,1],padding='same',dtype='float32')(l4)
l6 = Activation('sigmoid') (l5)
model = Model(inputs=inputL,outputs=l6)
model.compile(optimizer='sgd',loss='mse',metrics='accuracy' )
#Create random images
import numpy as np
import random
X_train = np.zeros([96,64,64,1])
for imIdx in range(96):
    centPoin = random.randrange(7,50)
    X_train[imIdx,centPoin-5:centPoin+5,centPoin-5:centPoin+5,0]=1
X_val = X_train[:32,:,:,:]
X_train = X_train[32:,:,:,:]
Y_train = X_train.copy()
X_train = np.random.normal(0.,0.1,size=X_train.shape)+X_train
for imIdx in range(64):
    X_train[imIdx,:,:,:] = X_train[imIdx,:,:,:]+np.random.normal(0,0.2,size=1)
from tensorflow.keras.utils import Sequence
import random
import tensorflow as tf
#setup the data generator
class dataGen (Sequence):
    def __init__ (self,x_set,y_set,batch_size):
        self.x, self.y = x_set, y_set
        self.batch_size = batch_size
        nSamples = self.x.shape[0]
        patList = np.array(range(nSamples),dtype='int16')
        patList = patList.reshape(nSamples,1)
        np.random.shuffle(patList)
        self.patList = patList
    def __len__ (self):
        return round(self.patList.shape[0] / self.batch_size)
    def __getitem__ (self, idx):
        patStart = idx
        batchS = self.batch_size
        listLen = self.patList.shape[0]
        Xout = np.zeros((batchS,64,64,1))
        Yout = np.zeros((batchS,64,64,1))
        for patIdx in range(batchS):
            curPat = (patStart+patIdx) % listLen
            patInd = self.patList[curPat]
            Xout[patIdx,:,:] = self.x[patInd,:,:,:]
            Yout[patIdx,:,:] = self.y[patInd,:,:,:]
        return Xout, Yout
    def on_epoch_end(self):
        np.random.shuffle(self.patList)
    def setBatchSize(self,batchS):
        self.batch_size = batchS
#load the data in the generator
trainGen = dataGen(X_train,Y_train,16)
valGen = dataGen(X_val,X_val,16)
# train the model only briefly (32 epochs), so that the loss is still bad
trainSteps = len(trainGen)
model.fit(trainGen,steps_per_epoch=trainSteps,epochs=32,validation_data=valGen,validation_steps=len(valGen))
trainGen.setBatchSize(4)
model.evaluate(trainGen)
[0.16259156167507172, 0.9870567321777344]
trainGen.setBatchSize(16)
model.evaluate(trainGen)
[0.17035068571567535, 0.9617958068847656]
trainGen.on_epoch_end()
trainGen.setBatchSize(16)
model.evaluate(trainGen)
[0.16663715243339539, 0.9710426330566406]
If I call model.evaluate(X_train,Y_train,batch_size=16) instead, the result does not depend on the batch size.
If I train the model until convergence, where the loss gets down to 0.05, the same thing still happens, with the accuracy fluctuating between 0.95 and 0.99 from one evaluation to the next.
Why would this happen?
I'd expect the prediction to be super easy. Am I wrong?
Upvotes: 1
Views: 1442
Reputation: 138
You made a small mistake inside the __getitem__ function.
curPat = (patStart+patIdx) % listLen
should be changed to
curPat = (patStart*batchS+patIdx) % listLen
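patStart is equal to idx, the current batch number. If your data set contains 64 samples and your batch size is set to 16, the possible values for idx will be 0, 1, 2 and 3. curPat, on the other hand, is the index of the current sample in the shuffled list of sample numbers, so it should be able to take on all values from 0 to 63. In your code that is not the case: batch idx reads positions idx to idx+batchS-1, so consecutive batches overlap almost completely and only positions 0 to 18 are ever read. Which samples sit at those positions changes with every shuffle, so the evaluation result changes too. Multiplying patStart by batchS makes the batches non-overlapping and fixes this.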
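To check the index arithmetic without training anything, here is a small sketch that counts how often each position of the shuffled list is read in one pass over the generator. It is not part of the original code: the helper names batch_positions and coverage are made up for illustration, and the number of batches is computed with the same round(...) expression as __len__ in the question, assuming 64 training samples.

```python
import numpy as np

def batch_positions(idx, batch_size, n_samples, fixed):
    """Positions in the shuffled list that batch number `idx` reads."""
    # The question's code starts at `idx`; the corrected code starts at `idx * batch_size`.
    start = idx * batch_size if fixed else idx
    return [(start + k) % n_samples for k in range(batch_size)]

def coverage(batch_size, n_samples=64, fixed=False):
    """Count how often each position is read during one full pass."""
    n_batches = round(n_samples / batch_size)  # mirrors __len__ in the question
    counts = np.zeros(n_samples, dtype=int)
    for idx in range(n_batches):
        for pos in batch_positions(idx, batch_size, n_samples, fixed):
            counts[pos] += 1
    return counts

for bs in (4, 16):
    buggy = coverage(bs, fixed=False)
    good = coverage(bs, fixed=True)
    print(f"batch size {bs:2d} | original: {np.count_nonzero(buggy)} distinct positions, "
          f"up to {buggy.max()} reads each | corrected: {np.count_nonzero(good)} distinct, "
          f"up to {good.max()} read each")
```

With the original indexing, both batch sizes touch only 19 of the 64 positions, some of them four times. With the corrected indexing, every position is read exactly once. Because the 19 positions are filled by different samples after each call to np.random.shuffle, the evaluated subset, and therefore the metric, varies from run to run.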
Upvotes: 3