jbflow

Reputation: 642

Why are Keras accuracy and loss not changing between epochs, and how do I fix it?

I'm trying to train a model like the following:

import numpy as np
import keras

input1 = np.array([[2], [1], [4], [3], [5]])
input2 = np.array([[2, 1, 8, 4], [2, 6, 1, 9], [7, 3, 1, 4], [3, 1, 6, 10], [3, 2, 7, 5]])
outputs = np.array([[3,3,1,0], [3,3,3,0], [3,3,4,0], [3,3,1,0], [3,3,4,0]])

merged = np.column_stack([input1, input2])
model = keras.Sequential([
    keras.layers.Dense(2, input_dim=5, activation='relu'),
    keras.layers.Dense(2, activation='relu'),
    keras.layers.Dense(4, activation='sigmoid'),
])

model.compile(
    loss="mean_squared_error", optimizer="adam", metrics=["accuracy"]
)

model.fit(merged, outputs, batch_size=16, epochs = 100)

This results in an accuracy of 0.6000 and a loss of about 4.6, and neither value changes between epochs.

Why is this, and how can I get it to work?

I've tried a few different optimizers and loss functions.

Upvotes: 1

Views: 1333

Answers (2)

jbflow

Reputation: 642

OK, I have found the reason for my issue, thanks to the other answers, the comments above, and some further reading. I found out that I need to use one-hot encoding (OneHotEncoder) to convert the data to binary, and also reduce batch_size to 1. This is my code now; it does a better job and reduces the loss.

import keras
import numpy as np
from sklearn.preprocessing import OneHotEncoder

input1 = np.array([[2], [1], [4], [3], [5]])
input2 = np.array([[2, 1, 8, 4], [2, 6, 1, 9], [7, 3, 1, 4], [3, 1, 6, 10], [3, 2, 7, 5]])
outputs = np.array([[3,3,1,0], [3,3,3,0], [3,3,4,0], [3,3,1,0], [3,3,4,0]])
merged = np.column_stack([input1, input2])

# Use separate encoders for inputs and targets so that each one keeps
# the categories it was fitted on (needed to decode predictions later).
x_encoder = OneHotEncoder()
y_encoder = OneHotEncoder()
x = x_encoder.fit_transform(merged).toarray()   # shape (5, 20)
y = y_encoder.fit_transform(outputs).toarray()  # shape (5, 6)


model = keras.Sequential([
    keras.layers.Dense(30, input_dim=20, activation='relu'),
    keras.layers.Dense(20, activation='relu'),
    keras.layers.Dense(15, activation='relu'),
    keras.layers.Dense(10, activation='relu'),
    keras.layers.Dense(6, activation='sigmoid')
])

model.compile(loss="binary_crossentropy", optimizer="adam")
model.fit(x, y, batch_size=1, epochs=100)

This works and answers the question, but it doesn't appear to actually solve my problem for my use case. That's another topic, though, so I've asked a separate question.
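For reference, the layer sizes in the code above (input_dim=20 and 6 output units) follow from how one-hot encoding expands each column into one binary column per distinct value. Here is a minimal pure-Python sketch of that count, with the same data written as plain lists. The helper name one_hot_width is made up for illustration and is not part of scikit-learn.

```python
# Same data as in the answer, written as plain lists so no libraries are needed.
merged = [
    [2, 2, 1, 8, 4],
    [1, 2, 6, 1, 9],
    [4, 7, 3, 1, 4],
    [3, 3, 1, 6, 10],
    [5, 3, 2, 7, 5],
]
outputs = [[3, 3, 1, 0], [3, 3, 3, 0], [3, 3, 4, 0], [3, 3, 1, 0], [3, 3, 4, 0]]


def one_hot_width(rows):
    """Hypothetical helper: number of one-hot columns for a table,
    i.e. the sum of the distinct-value counts of each column."""
    columns = zip(*rows)  # transpose rows into columns
    return sum(len(set(column)) for column in columns)


x_width = one_hot_width(merged)   # 5 + 3 + 4 + 4 + 4
y_width = one_hot_width(outputs)  # 1 + 1 + 3 + 1
print(x_width, y_width)  # prints: 20 6
```

Because these widths depend on which values happen to appear in the data, they change as soon as the data does. Note also that three of the six target columns come from output columns with a single distinct value, so they are always 1.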

Upvotes: 0

MD Mushfirat Mohaimin

Reputation: 2066

Your model is too simple to fit this non-linear data. A larger model like the following might work:

model = keras.Sequential([
    keras.layers.Dense(20, input_dim=5, activation='relu'),
    keras.layers.Dense(15, activation='relu'),
    keras.layers.Dense(10, activation='relu'),
    keras.layers.Dense(4, activation='relu'),
])

In the final Dense layer, you selected a sigmoid activation function, which has a range of 0 to 1, but your target values fall outside that range (they go up to 4). This is another reason you are seeing low accuracy. Changing the output activation to relu, which is not bounded above, should fix it.
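To make the range mismatch concrete: a sigmoid output can never exceed 1, so with these targets the mean squared error has a floor well above zero however long training runs. The rough pure-Python sketch below (no Keras involved; clamp01, mse, and best_sigmoid_mse are names made up for illustration) computes the error for the closest predictions a sigmoid layer could approach.

```python
outputs = [[3, 3, 1, 0], [3, 3, 3, 0], [3, 3, 4, 0], [3, 3, 1, 0], [3, 3, 4, 0]]


def clamp01(value):
    """Closest value to `value` inside the sigmoid's range, treating the
    limits 0 and 1 as reachable (a real sigmoid only approaches them)."""
    return min(1.0, max(0.0, value))


def mse(targets, predictions):
    """Mean squared error averaged over every element of the table."""
    total = sum(
        (t - p) ** 2
        for row_t, row_p in zip(targets, predictions)
        for t, p in zip(row_t, row_p)
    )
    count = sum(len(row) for row in targets)
    return total / count


# Best case for a sigmoid output layer: each prediction sits at the
# bound nearest to its target.
best_predictions = [[clamp01(t) for t in row] for row in outputs]
best_sigmoid_mse = mse(outputs, best_predictions)
print(best_sigmoid_mse)  # prints: 3.1
```

So with a sigmoid on the last layer, the loss on this data cannot drop below about 3.1. An output activation without an upper bound, such as relu, does not have that limit.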

Upvotes: 1
