Neural Network model not improving accuracy. Scaling problem or model problem?

Question

I am trying to create a simple neural network to see how it works.

A second-degree equation has the form (x-x1)*(x-x2)=0, and if you expand it, it becomes ax^2+bx+c=0, where a=1, b=-(x1+x2), c=x1*x2. I want to create a neural network where the inputs are (b,c), and the outputs are (x1,x2).

In order to do this, I have created 2 functions that create the data, and stored them in matrices named input and output.

I have created a neural network with layers 2x2x2 (including inputs and output), and tested it out with bad results, even after tweaking it.

I guess the issue is with the data, because the neural network runs and produces a result, but the result is not good.

I don't know where the issue is, but my guess is that it has to do with the data scaling. I have tried to introduce the data without scaling it, but I get the same bad results.

The idea is that I provide enough training, so the weights and biases are such that provided any input data, the outcome will be very close to the desired output.

This is the code of the whole program

import keras
from keras import backend as K
from keras.models import Sequential
from keras.models import load_model
from keras.layers import Dense, Activation
from keras.layers.core import Dense
from keras.optimizers import SGD
from keras.metrics import categorical_crossentropy
from sklearn.metrics import  confusion_matrix
import itertools

import os
os.environ['TF_CPP_MIN_LOG_LEVEL']='2'

from random import randint
from sklearn.preprocessing import MinMaxScaler


import numpy as np

def abc(x1, x2):
    b=-(x1+x2)  # (x-x1)*(x-x2) = x^2 - (x1+x2)*x + x1*x2
    c=x1*x2
    sol=[b,c]
    return sol

a=10
b=10
c=a*b


def Nx2(N, M):
    matrix=[]
    n = N+ 1
    m= M + 1
    for i in range(1,n):
        for j in range(1,m):
            temp=[i,j]
            matrix.append(temp)
    final_matrix = np.array(matrix)
    return final_matrix

output=Nx2(a, b)

# print(output)

input=[]
for i in range(0,c):
    temp2=abc(output[i,0],output[i,1])
    input.append(temp2)
input=np.array(input)

print(input)

train_labels = output
train_samples = input

scaler = MinMaxScaler(feature_range=(0,1))
scaled_train_samples = scaler.fit_transform((train_samples).reshape(-1,1))
scaled_train_samples=scaled_train_samples.reshape(-1,2)

scaler = MinMaxScaler(feature_range=(0,1))
scaled_train_labels = scaler.fit_transform((train_labels).reshape(-1,1))
scaled_train_labels=scaled_train_labels.reshape(-1,2)

print(scaled_train_samples)
print(scaled_train_labels)

model = Sequential([
    Dense(2, input_shape=(2,), activation='sigmoid'),
    Dense(2, activation='sigmoid'),
])

print(model.weights)

model.compile(SGD(lr=0.01), loss='mean_squared_error', metrics=['accuracy'])
model.fit(scaled_train_labels, scaled_train_labels, validation_split=0.2, batch_size=10, epochs=20, shuffle=True, verbose=2)

print(model.summary())
print(model.weights)

These are the kind of results that I am getting.

Epoch 1/20
 - 0s - loss: 0.1456 - accuracy: 0.5500 - val_loss: 0.3715 - val_accuracy: 0.0500
Epoch 2/20
 - 0s - loss: 0.1449 - accuracy: 0.5500 - val_loss: 0.3704 - val_accuracy: 0.0500
Epoch 3/20
 - 0s - loss: 0.1443 - accuracy: 0.5500 - val_loss: 0.3692 - val_accuracy: 0.0500
Epoch 4/20
 - 0s - loss: 0.1437 - accuracy: 0.5500 - val_loss: 0.3681 - val_accuracy: 0.0500
Epoch 5/20
 - 0s - loss: 0.1431 - accuracy: 0.5500 - val_loss: 0.3670 - val_accuracy: 0.0500
Epoch 6/20
 - 0s - loss: 0.1425 - accuracy: 0.5500 - val_loss: 0.3658 - val_accuracy: 0.0500
Epoch 7/20
 - 0s - loss: 0.1419 - accuracy: 0.5500 - val_loss: 0.3647 - val_accuracy: 0.0500
Epoch 8/20
 - 0s - loss: 0.1413 - accuracy: 0.5500 - val_loss: 0.3636 - val_accuracy: 0.0500
Epoch 9/20
 - 0s - loss: 0.1407 - accuracy: 0.5500 - val_loss: 0.3625 - val_accuracy: 0.0500
Epoch 10/20
 - 0s - loss: 0.1401 - accuracy: 0.5500 - val_loss: 0.3613 - val_accuracy: 0.0500
Epoch 11/20
 - 0s - loss: 0.1395 - accuracy: 0.5500 - val_loss: 0.3602 - val_accuracy: 0.0500
Epoch 12/20
 - 0s - loss: 0.1389 - accuracy: 0.5500 - val_loss: 0.3591 - val_accuracy: 0.0500
Epoch 13/20
 - 0s - loss: 0.1383 - accuracy: 0.5500 - val_loss: 0.3580 - val_accuracy: 0.0500
Epoch 14/20
 - 0s - loss: 0.1377 - accuracy: 0.5500 - val_loss: 0.3568 - val_accuracy: 0.0500
Epoch 15/20
 - 0s - loss: 0.1372 - accuracy: 0.5500 - val_loss: 0.3557 - val_accuracy: 0.0500
Epoch 16/20
 - 0s - loss: 0.1366 - accuracy: 0.5500 - val_loss: 0.3546 - val_accuracy: 0.0500
Epoch 17/20
 - 0s - loss: 0.1360 - accuracy: 0.5500 - val_loss: 0.3535 - val_accuracy: 0.0500
Epoch 18/20
 - 0s - loss: 0.1354 - accuracy: 0.5500 - val_loss: 0.3524 - val_accuracy: 0.0500
Epoch 19/20
 - 0s - loss: 0.1348 - accuracy: 0.5500 - val_loss: 0.3513 - val_accuracy: 0.0500
Epoch 20/20
 - 0s - loss: 0.1342 - accuracy: 0.5500 - val_loss: 0.3502 - val_accuracy: 0.0500

Can someone point me in the right direction?

Thank you

desertnaut · Accepted Answer

There are several issues with your code:

You are using accuracy for a regression problem, which is meaningless (accuracy is only applicable to classification problems). You should monitor the performance of your model only with the loss, here MSE (for the same reason, you don't need to import confusion_matrix or categorical_crossentropy).
You erroneously use sigmoid activation for your last layer; in regression problems, this should be linear (or leave blank, as linear is the default Keras activation).
You should use the relu activation for your intermediate layers, not sigmoid.
Your model looks way too simple, and it is unclear why you feel constrained to just use 2-node layers (you should not, except of course for the output one).
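
To illustrate the first point, here is a small NumPy sketch. The arrays and the `exact_match_rate` helper are made up for illustration, and this is not how Keras computes its own accuracy metric; it only shows why a match-based score carries little information for continuous targets, while MSE does reflect how close the predictions are.

```python
import numpy as np

# Hypothetical continuous targets (e.g. scaled roots) and two sets of predictions.
y_true = np.array([[0.10, 0.20], [0.30, 0.40], [0.50, 0.60]])
close_pred = y_true + 0.01   # off by 0.01 everywhere
far_pred = y_true + 0.40     # off by 0.40 everywhere


def mse(y, p):
    """Mean squared error over all elements."""
    return float(np.mean((y - p) ** 2))


def exact_match_rate(y, p):
    """Fraction of rows where the prediction equals the target exactly."""
    return float(np.mean(np.all(y == p, axis=1)))


# MSE separates the good predictions from the bad ones...
print(mse(y_true, close_pred), mse(y_true, far_pred))
# ...while an exact-match score treats both as equally wrong.
print(exact_match_rate(y_true, close_pred), exact_match_rate(y_true, far_pred))
```

Both prediction sets get the same match score even though one is far closer to the targets, which is why only the loss is worth monitoring here.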

All in all, here is a starting point:

model = Sequential([
    Dense(30, input_shape=(2,), activation='relu'),
    # Dense(10, activation='relu'), # uncomment for experimentation
    Dense(2, activation='linear'),
])

model.compile(SGD(lr=0.01), loss='mean_squared_error')

but the key word here is experimentation...
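
If it helps to see the same shape without depending on a particular Keras version, below is a NumPy-only sketch of the idea: one relu hidden layer, a linear output, and plain full-batch gradient descent on MSE. Everything specific in it is an assumption for illustration: the hidden width of 30, the learning rate, the number of epochs, the per-column min-max scaling, the standard expansion b=-(x1+x2), c=x1*x2, and the restriction to x1 <= x2 (so that each (b, c) pair maps to exactly one target). It is not a replacement for the Keras model above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: roots with x1 <= x2 in 1..10 (assumed, so the target is unambiguous).
roots = np.array([[i, j] for i in range(1, 11) for j in range(i, 11)], dtype=float)
# Coefficients of x^2 + b*x + c = (x - x1)*(x - x2).
coeffs = np.column_stack([-(roots[:, 0] + roots[:, 1]), roots[:, 0] * roots[:, 1]])


def minmax(a):
    """Scale each column to [0, 1] independently."""
    lo, hi = a.min(axis=0), a.max(axis=0)
    return (a - lo) / (hi - lo)


X, Y = minmax(coeffs), minmax(roots)   # samples first, labels second

hidden = 30  # illustrative width
W1 = rng.normal(0.0, 0.3, size=(2, hidden))
b1 = np.zeros(hidden)
W2 = rng.normal(0.0, 0.3, size=(hidden, 2))
b2 = np.zeros(2)


def forward(inputs):
    """Return (hidden activations, predictions)."""
    h = np.maximum(0.0, inputs @ W1 + b1)   # relu hidden layer
    return h, h @ W2 + b2                   # linear output layer


def mse(pred, target):
    return float(np.mean((pred - target) ** 2))


lr = 0.05
losses = []
for epoch in range(500):
    h, P = forward(X)
    losses.append(mse(P, Y))
    # Backpropagate the MSE gradient through both layers.
    dP = 2.0 * (P - Y) / Y.size
    dW2, db2 = h.T @ dP, dP.sum(axis=0)
    dh = (dP @ W2.T) * (h > 0)
    dW1, db1 = X.T @ dh, dh.sum(axis=0)
    # In-place updates so forward() sees the new weights.
    W1 -= lr * dW1
    b1 -= lr * db1
    W2 -= lr * dW2
    b2 -= lr * db2

print(f"first loss {losses[0]:.4f}, last loss {losses[-1]:.4f}")
```

Changing `hidden`, `lr`, or the number of epochs and watching the loss is the kind of experimentation meant here.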

Last but not least, you seem to have a typo in your model.fit() call: you pass the labels twice, instead of passing the samples as the first argument. Be sure to fix this, too.
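
A cheap guard against this kind of slip is to sanity-check the two arrays before training. The sketch below uses a hypothetical helper, `check_fit_args`, and made-up example arrays; it is not part of Keras, just a few lines you could run before calling `model.fit()`.

```python
import numpy as np


def check_fit_args(x, y):
    """Raise ValueError if x and y cannot be a valid (samples, labels) pair."""
    x, y = np.asarray(x), np.asarray(y)
    if x.shape[0] != y.shape[0]:
        raise ValueError("samples and labels must have the same number of rows")
    if x.shape == y.shape and np.array_equal(x, y):
        raise ValueError("samples and labels are identical; was the same array passed twice?")


# Made-up example data: roots (x1, x2) and matching coefficients (b, c).
labels = np.array([[1.0, 2.0], [3.0, 4.0]])
samples = np.array([[-3.0, 2.0], [-7.0, 12.0]])

check_fit_args(samples, labels)       # fine: coefficients in, roots out
try:
    check_fit_args(labels, labels)    # the slip from the question
except ValueError as err:
    print(err)
```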
