RGBCoco

Reputation: 15

Neural Network increases loss instead of decreasing it

I used two online references to construct a neural network in Python with four input nodes, a hidden layer of 4 nodes, and 6 output nodes. When I run the network, the loss increases rather than decreasing, which I believe means its predictions are getting worse.

Sorry for the ocean of code; I have no idea where in the code the issue could be, and nothing I've tried has fixed it. Is there something wrong with my code, or is my assumption about the loss function wrong?

import numpy as np

#defining inputs and real outputs
inputData = np.array([[10.0, 5.0, 15.0, 3.0],
                      [9.0, 6.0, 16.0, 4.0],
                      [8.0, 4.0, 17.0, 5.0],
                      [7.0, 3.0, 18.0, 6.0],
                      [6.0, 2.0, 19.0, 7.0]])

statsReal = np.array([[0.0, 0.2, 0.4, 0.6, 0.8, 1.0],
                      [0.0, 0.2, 0.4, 0.6, 0.8, 1.0],
                      [0.0, 0.2, 0.4, 0.6, 0.8, 1.0],
                      [0.0, 0.2, 0.4, 0.6, 0.8, 1.0],
                      [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]])


def sigmoid(x):
    return 1/(1 + np.exp(-x))

def sigmoid_d_dx(x):
    return sigmoid(x) * (1 - sigmoid(x))

def softmax(A):
    expA = np.exp(A)
    return expA / expA.sum(axis=1, keepdims=True)

#defining the hidden and output nodes, and the weights and biases for hidden and output layers
instances = inputData.shape[0]
attributes = inputData.shape[1]
hidden_nodes = 4
output_nodes = 6

wh = np.random.rand(attributes,hidden_nodes)
bh = np.random.randn(hidden_nodes)

wo = np.random.rand(hidden_nodes,output_nodes)
bo = np.random.randn(output_nodes)
learningRate = 10e-4

error_cost = []

for epoch in range(100):

    #Feedforward Phase 1
    zh = np.dot(inputData, wh) + bh
    ah = sigmoid(zh)

    #Feedforward Phase 2
    zo = np.dot(ah, wo) + bo
    ao = softmax(zo)
    
    #Backpropagation Phase 1
    dcost_dzo = ao - statsReal
    dzo_dwo = ah
    dcost_wo = np.dot(dzo_dwo.T, dcost_dzo)
    dcost_bo = dcost_dzo

    #Backpropagation Phase 2
    dzo_dah = wo
    dcost_dah = np.dot(dcost_dzo, dzo_dah.T)
    dah_dzh = sigmoid_d_dx(zh)
    dzh_dwh = inputData
    dcost_wh = np.dot(dzh_dwh.T, dah_dzh * dcost_dah)
    dcost_bh = dcost_dah*dah_dzh

    #Weight Updates
    wh -= learningRate * dcost_wh
    bh -= learningRate * dcost_bh.sum(axis=0)
    wo -= learningRate * dcost_wo
    bo -= learningRate * dcost_bo.sum(axis=0)

    loss = np.sum(-statsReal * np.log(ao))
    print(loss)
    error_cost.append(loss)

print(error_cost)

Upvotes: 1

Views: 145

Answers (1)

Michael Szczesny

Reputation: 5036

Your network does learn when you train it with reasonable data.

Try this data, for example. I added one distinct case for every class and one-hot encoded the targets, and I scaled the inputs to [0.0, 1.0] (a sketch of how to do the scaling and encoding programmatically follows the data below).

inputData = np.array([[1.0, 0.5, 0.0, 0.0],
                      [0.0, 1.0, 0.5, 0.0],
                      [1.0, 0.0, 0.0, 1.0],
                      [0.0, 1.0, 0.0, 0.5],
                      [0.0, 0.0, 0.0, 1.0],
                      [1.0, 1.0, 0.5, 0.0]])

statsReal = np.array([[1.0, 0.0, 0.0, 0.0, 0.0, 0.0],
                      [0.0, 1.0, 0.0, 0.0, 0.0, 0.0],
                      [0.0, 0.0, 1.0, 0.0, 0.0, 0.0],
                      [0.0, 0.0, 0.0, 1.0, 0.0, 0.0],
                      [0.0, 0.0, 0.0, 0.0, 1.0, 0.0],
                      [0.0, 0.0, 0.0, 0.0, 0.0, 1.0]])
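
If you prefer to build this preprocessing programmatically instead of by hand, here is a minimal NumPy sketch. The integer labels array is my own assumption purely for illustration, and min-max scaling is just one common way to map the inputs into [0.0, 1.0]:

import numpy as np

# Hypothetical raw inputs (the question's original data) and assumed integer class labels
rawInputs = np.array([[10.0, 5.0, 15.0, 3.0],
                      [9.0, 6.0, 16.0, 4.0],
                      [8.0, 4.0, 17.0, 5.0],
                      [7.0, 3.0, 18.0, 6.0],
                      [6.0, 2.0, 19.0, 7.0]])
labels = np.array([0, 1, 2, 3, 4])  # assumed class ids, one per row

# Min-max scale each column to [0.0, 1.0]
colMin = rawInputs.min(axis=0)
colMax = rawInputs.max(axis=0)
inputData = (rawInputs - colMin) / (colMax - colMin)

# One-hot encode the labels into a (samples, classes) target matrix
output_nodes = 6
statsReal = np.eye(output_nodes)[labels]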

Increase the learning rate.

learningRate = 10e-2

Train for more epochs and print a little bit less often.

for epoch in range(1000):
#....
    if epoch % 100 == 99: print(loss)
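
Putting the changes together, a condensed sketch of the whole loop (the forward and backward passes are unchanged from the question; only the learning rate, epoch count, and print frequency differ):

learningRate = 10e-2

for epoch in range(1000):
    # Feedforward
    zh = np.dot(inputData, wh) + bh
    ah = sigmoid(zh)
    zo = np.dot(ah, wo) + bo
    ao = softmax(zo)

    # Backpropagation (same gradients as in the question)
    dcost_dzo = ao - statsReal
    dcost_wo = np.dot(ah.T, dcost_dzo)
    dcost_dah = np.dot(dcost_dzo, wo.T)
    dah_dzh = sigmoid_d_dx(zh)
    dcost_wh = np.dot(inputData.T, dah_dzh * dcost_dah)

    # Weight and bias updates
    wh -= learningRate * dcost_wh
    bh -= learningRate * (dcost_dah * dah_dzh).sum(axis=0)
    wo -= learningRate * dcost_wo
    bo -= learningRate * dcost_dzo.sum(axis=0)

    # Cross-entropy loss, printed every 100 epochs
    loss = np.sum(-statsReal * np.log(ao))
    error_cost.append(loss)
    if epoch % 100 == 99: print(loss)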

Output of your loss function:

6.116573523774877
2.6901680150532847
1.323221228926058
0.7688474199923144
0.5186915091033664
0.38432651801528794
0.3024486736712547
0.24799685736356275
0.20944414625474833
0.1808455098847857

Upvotes: 1
