Reputation: 642
I'm trying to train a model like the following:
import numpy as np
import keras

input1 = np.array([[2], [1], [4], [3], [5]])
input2 = np.array([[2, 1, 8, 4], [2, 6, 1, 9], [7, 3, 1, 4], [3, 1, 6, 10], [3, 2, 7, 5]])
outputs = np.array([[3, 3, 1, 0], [3, 3, 3, 0], [3, 3, 4, 0], [3, 3, 1, 0], [3, 3, 4, 0]])
merged = np.column_stack([input1, input2])  # shape (5, 5)
model = keras.Sequential([
    keras.layers.Dense(2, input_dim=5, activation='relu'),
    keras.layers.Dense(2, activation='relu'),
    keras.layers.Dense(4, activation='sigmoid'),
])
model.compile(
    loss="mean_squared_error", optimizer="adam", metrics=["accuracy"]
)
model.fit(merged, outputs, batch_size=16, epochs=100)
This results in an accuracy of 0.6000 and a loss of about 4.6, and neither value changes between epochs.
Why is this, and how can I get it to work?
I've tried a few different optimizers and loss functions.
Upvotes: 1
Views: 1333
Reputation: 642
OK, I have found the reason for my issue, thanks to the other answers and comments and some further reading. I needed to one-hot encode the data (with OneHotEncoder) so that it is binary, and also reduce the batch_size to 1. This is my code now; it does a better job and reduces the loss.
import keras
from sklearn.preprocessing import OneHotEncoder
import numpy as np

input1 = np.array([[2], [1], [4], [3], [5]])
input2 = np.array([[2, 1, 8, 4], [2, 6, 1, 9], [7, 3, 1, 4], [3, 1, 6, 10], [3, 2, 7, 5]])
outputs = np.array([[3, 3, 1, 0], [3, 3, 3, 0], [3, 3, 4, 0], [3, 3, 1, 0], [3, 3, 4, 0]])
merged = np.column_stack([input1, input2])

# Separate encoders, so the fitted input encoder is not overwritten by the output one.
x_ohe = OneHotEncoder()
y_ohe = OneHotEncoder()
x = x_ohe.fit_transform(merged).toarray()   # shape (5, 20)
y = y_ohe.fit_transform(outputs).toarray()  # shape (5, 6)

model = keras.Sequential([
    keras.layers.Dense(30, input_dim=20, activation='relu'),
    keras.layers.Dense(20, activation='relu'),
    keras.layers.Dense(15, activation='relu'),
    keras.layers.Dense(10, activation='relu'),
    keras.layers.Dense(6, activation='sigmoid')
])
model.compile(loss="binary_crossentropy", optimizer='adam')
model.fit(x, y, batch_size=1, epochs=100)
This works and answers the question, but it doesn't appear to solve my actual problem or work for my use case. That's a separate topic, though, so I've asked another question.
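For anyone wondering where input_dim=20 and the final Dense(6) in the code above come from: one-hot encoding produces one column per distinct value in each original column. Here is a small NumPy-only sketch of that count (one_hot_width is a helper name made up for illustration, not part of scikit-learn). The widths depend on the values present in these five rows, so they would change with different data.

```python
import numpy as np

input1 = np.array([[2], [1], [4], [3], [5]])
input2 = np.array([[2, 1, 8, 4], [2, 6, 1, 9], [7, 3, 1, 4], [3, 1, 6, 10], [3, 2, 7, 5]])
outputs = np.array([[3, 3, 1, 0], [3, 3, 3, 0], [3, 3, 4, 0], [3, 3, 1, 0], [3, 3, 4, 0]])
merged = np.column_stack([input1, input2])


def one_hot_width(data):
    """Hypothetical helper: count one encoded column per distinct value in each column."""
    return sum(len(np.unique(data[:, j])) for j in range(data.shape[1]))


x_width = one_hot_width(merged)   # 5 + 3 + 4 + 4 + 4 distinct values
y_width = one_hot_width(outputs)  # 1 + 1 + 3 + 1 distinct values
print(x_width, y_width)
```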
Upvotes: 0
Reputation: 2066
Your model is too simple to fit the non-linear data.
A larger model like this might work:
model = keras.Sequential([
    keras.layers.Dense(20, input_dim=5, activation='relu'),
    keras.layers.Dense(15, activation='relu'),
    keras.layers.Dense(10, activation='relu'),
    keras.layers.Dense(4, activation='relu'),
])
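To put a rough number on "too simple", here is a small sketch that counts the trainable weights and biases in a stack of Dense layers. dense_params is a helper name made up for this illustration, not a Keras function. A higher count does not guarantee a better fit; it only shows how little capacity the original 2-2-4 network has.

```python
def dense_params(n_inputs, layer_units):
    """Hypothetical helper: total weights and biases in a stack of Dense layers."""
    total = 0
    for units in layer_units:
        total += n_inputs * units + units  # weight matrix plus bias vector
        n_inputs = units                   # this layer's output feeds the next layer
    return total


original = dense_params(5, [2, 2, 4])        # layer sizes from the question
wider = dense_params(5, [20, 15, 10, 4])     # layer sizes from the model above
print(original, wider)
```

With these layer sizes the question's model has 30 trainable parameters and the wider one has 639.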
In the final Dense layer, you selected a sigmoid activation function, whose output range is 0 to 1, but your target values (0 to 4) fall outside that range. This is another reason you are seeing low accuracy. So, changing that activation to relu, whose output has no upper bound, will remove that limitation.
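As a rough check of the range argument, this NumPy-only sketch computes the lowest mean squared error any sigmoid-output model could reach on the targets from the question. It assumes the best a sigmoid can do for each target is that target clipped into the interval 0 to 1.

```python
import numpy as np

outputs = np.array([[3, 3, 1, 0], [3, 3, 3, 0], [3, 3, 4, 0], [3, 3, 1, 0], [3, 3, 4, 0]])

# A sigmoid output lies between 0 and 1, so the closest it can get
# to each target is that target clipped into [0, 1].
best_possible = np.clip(outputs, 0, 1)

# Lower bound on the mean squared error for any model with a sigmoid output layer.
mse_floor = np.mean((outputs - best_possible) ** 2)
print(mse_floor)
```

The bound works out to 3.1, so with a sigmoid output the loss on this data cannot approach zero no matter how long the model trains.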
Upvotes: 1