Hans-Ulrich Rudel

Reputation: 47

Why are accuracy and loss staying exactly the same while training?

I have tried to modify the introductory tutorial from https://www.tensorflow.org/tutorials/keras/basic_classification to work with my own data. The goal is to classify images of dogs and cats. The code is very simple and given below. The problem is that the network does not seem to learn at all: training loss and accuracy stay the same after every epoch.

The images (X_training) and the labels (y_training) seem to have the right format: X_training.shape returns (18827, 80, 80, 3).

y_training is a one-dimensional list with entries in {0, 1}.

I have checked several times that the "images" in X_training are correctly labeled: if X_training[i,:,:,:] represents a dog, then y_training[i] returns 1; if X_training[i,:,:,:] represents a cat, then y_training[i] returns 0.
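For completeness, this is the kind of spot check that confirms it (a minimal sketch; it assumes matplotlib is available in addition to the omitted imports, and that the pickled objects behave like NumPy arrays):

import numpy as np
import matplotlib.pyplot as plt

# confirm shapes and label range
print(X_training.shape)        # expected: (18827, 80, 80, 3)
print(np.unique(y_training))   # expected: [0 1]

# show one sample together with its label (1 = dog, 0 = cat)
i = 0
plt.imshow(X_training[i])
plt.title("label: %d" % y_training[i])
plt.show()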

Shown below is the complete Python file without the import statements.

#loading the data from 4 pickle files:
pickle_in = open("X_training.pickle","rb")
X_training = pickle.load(pickle_in)

pickle_in = open("X_testing.pickle","rb")
X_testing = pickle.load(pickle_in)

pickle_in = open("y_training.pickle","rb")
y_training = pickle.load(pickle_in)

pickle_in = open("y_testing.pickle","rb")
y_testing = pickle.load(pickle_in)


#normalizing the input data:
X_training = X_training/255.0
X_testing = X_testing/255.0


#building the model:
model = keras.Sequential([
    keras.layers.Flatten(input_shape=(80, 80,3)),
    keras.layers.Dense(128, activation=tf.nn.relu),
    keras.layers.Dense(1, activation='sigmoid')
])
model.compile(optimizer='adam', loss='mean_squared_error', metrics=['accuracy'])


#running the model:
model.fit(X_training, y_training, epochs=10)

The code compiles and trains for 10 epochs, but neither loss nor accuracy improves; they stay exactly the same after every epoch. The code works fine with the Fashion-MNIST dataset used in the tutorial, with slight changes accounting for the difference in multiclass vs. binary classification and in input shape.

Upvotes: 3

Views: 2670

Answers (2)

Hans-Ulrich Rudel

Reputation: 47

I have found a different implementation with a slightly more complex model that works. Here is the complete code without the import statements:

#global variables:
batch_size = 32
nr_of_epochs = 64
input_shape = (80,80,3)


#loading the data from 4 pickle files:
pickle_in = open("X_training.pickle","rb")
X_training = pickle.load(pickle_in)

pickle_in = open("X_testing.pickle","rb")
X_testing = pickle.load(pickle_in)

pickle_in = open("y_training.pickle","rb")
y_training = pickle.load(pickle_in)

pickle_in = open("y_testing.pickle","rb")
y_testing = pickle.load(pickle_in)



#building the model
def define_model():
    model = Sequential()
    model.add(Conv2D(32, (3, 3), activation='relu', input_shape=input_shape))
    model.add(MaxPooling2D((2, 2)))
    model.add(Flatten())
    model.add(Dense(128, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    # compile model
    model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
    return model
model = define_model()


#Possibility for image data augmentation
train_datagen = ImageDataGenerator(rescale=1.0/255.0)
val_datagen = ImageDataGenerator(rescale=1.0/255.0)
train_generator = train_datagen.flow(X_training, y_training, batch_size=batch_size)
val_generator = val_datagen.flow(X_testing, y_testing, batch_size=batch_size)



#running the model
history = model.fit_generator(train_generator, steps_per_epoch=len(X_training) // batch_size,
                              epochs=nr_of_epochs, validation_data=val_generator,
                              validation_steps=len(X_testing) // batch_size)
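The generators above only rescale. To actually augment the training images, the train generator could be given random transforms, for example (a sketch; the parameter values are illustrative, not tuned):

train_datagen = ImageDataGenerator(
    rescale=1.0/255.0,
    rotation_range=15,       # random rotations up to 15 degrees
    width_shift_range=0.1,   # random horizontal shifts
    height_shift_range=0.1,  # random vertical shifts
    horizontal_flip=True     # mirrored dogs/cats are still dogs/cats
)

Only the training generator should augment; the validation generator should keep just the rescaling, so validation metrics stay comparable across epochs.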

Upvotes: 1

Ioannis Nasios

Reputation: 8527

If you want to train a classification model, you must use binary_crossentropy as your loss function, not mean_squared_error, which is used for regression tasks.

replace

model.compile(optimizer='adam', loss='mean_squared_error', metrics=['accuracy'])

with

model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

Furthermore, I would recommend not using a relu activation on your dense layer, but a linear one.

replace

keras.layers.Dense(128, activation=tf.nn.relu),

with

keras.layers.Dense(128),

And of course, to better use the power of neural networks, add some convolutional layers before your Flatten layer.
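Putting all three suggestions together, your original model could look like this (a minimal sketch; the layer count and sizes are illustrative, not tuned):

model = keras.Sequential([
    keras.layers.Conv2D(32, (3, 3), activation='relu', input_shape=(80, 80, 3)),
    keras.layers.MaxPooling2D((2, 2)),
    keras.layers.Flatten(),
    keras.layers.Dense(128),
    keras.layers.Dense(1, activation='sigmoid')
])
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])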

Upvotes: 1
