Reputation: 15
I used two online references to construct a neural network in Python with four input nodes, a layer of 4 hidden nodes, and 6 output nodes. When I run the network, the loss increases rather than decreasing, which I believe means its predictions are getting worse.
Sorry for the ocean of code; I have no idea where in the code the issue could be. Nothing I have tried has fixed it. Is there something wrong with my code, or is my assumption about the loss function wrong?
```python
import numpy as np

# defining inputs and real outputs
inputData = np.array([[10.0, 5.0, 15.0, 3.0],
                      [9.0, 6.0, 16.0, 4.0],
                      [8.0, 4.0, 17.0, 5.0],
                      [7.0, 3.0, 18.0, 6.0],
                      [6.0, 2.0, 19.0, 7.0]])

statsReal = np.array([[0.0, 0.2, 0.4, 0.6, 0.8, 1.0],
                      [0.0, 0.2, 0.4, 0.6, 0.8, 1.0],
                      [0.0, 0.2, 0.4, 0.6, 0.8, 1.0],
                      [0.0, 0.2, 0.4, 0.6, 0.8, 1.0],
                      [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]])

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_d_dx(x):
    return sigmoid(x) * (1 - sigmoid(x))

def softmax(A):
    expA = np.exp(A)
    return expA / expA.sum(axis=1, keepdims=True)

# defining the hidden and output nodes, and the weights and biases
# for the hidden and output layers
instances = inputData.shape[0]
attributes = inputData.shape[1]
hidden_nodes = 4
output_nodes = 6

wh = np.random.rand(attributes, hidden_nodes)
bh = np.random.randn(hidden_nodes)
wo = np.random.rand(hidden_nodes, output_nodes)
bo = np.random.randn(output_nodes)
learningRate = 10e-4

error_cost = []

for epoch in range(100):
    # Feedforward Phase 1
    zh = np.dot(inputData, wh) + bh
    ah = sigmoid(zh)
    # Feedforward Phase 2
    zo = np.dot(ah, wo) + bo
    ao = softmax(zo)
    # Backpropagation Phase 1
    dcost_dzo = ao - statsReal
    dzo_dwo = ah
    dcost_wo = np.dot(dzo_dwo.T, dcost_dzo)
    dcost_bo = dcost_dzo
    # Backpropagation Phase 2
    dzo_dah = wo
    dcost_dah = np.dot(dcost_dzo, dzo_dah.T)
    dah_dzh = sigmoid_d_dx(zh)
    dzh_dwh = inputData
    dcost_wh = np.dot(dzh_dwh.T, dah_dzh * dcost_dah)
    dcost_bh = dcost_dah * dah_dzh
    # Weight Updates
    wh -= learningRate * dcost_wh
    bh -= learningRate * dcost_bh.sum(axis=0)
    wo -= learningRate * dcost_wo
    bo -= learningRate * dcost_bo.sum(axis=0)
    loss = np.sum(-statsReal * np.log(ao))
    print(loss)
    error_cost.append(loss)

print(error_cost)
```
Upvotes: 1
Views: 145
Reputation: 5036
Your network is learning when you train it with reasonable data.
Try this data, for example. I added one distinct case for every class and one-hot encoded the targets, and I scaled the inputs to [0.0, 1.0]:
```python
inputData = np.array([[1.0, 0.5, 0.0, 0.0],
                      [0.0, 1.0, 0.5, 0.0],
                      [1.0, 0.0, 0.0, 1.0],
                      [0.0, 1.0, 0.0, 0.5],
                      [0.0, 0.0, 0.0, 1.0],
                      [1.0, 1.0, 0.5, 0.0]])

statsReal = np.array([[1.0, 0.0, 0.0, 0.0, 0.0, 0.0],
                      [0.0, 1.0, 0.0, 0.0, 0.0, 0.0],
                      [0.0, 0.0, 1.0, 0.0, 0.0, 0.0],
                      [0.0, 0.0, 0.0, 1.0, 0.0, 0.0],
                      [0.0, 0.0, 0.0, 0.0, 1.0, 0.0],
                      [0.0, 0.0, 0.0, 0.0, 0.0, 1.0]])
```
Increase the learning rate:

```python
learningRate = 10e-2
```
Train for more epochs and print a little bit less often:

```python
for epoch in range(1000):
    # ...
    if epoch % 100 == 99: print(loss)
```
Output of your loss function:

```
6.116573523774877
2.6901680150532847
1.323221228926058
0.7688474199923144
0.5186915091033664
0.38432651801528794
0.3024486736712547
0.24799685736356275
0.20944414625474833
0.1808455098847857
```
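One related aside: the question's `softmax` exponentiates the raw logits, which can overflow for large unscaled activations; scaling the inputs to [0.0, 1.0] sidesteps that here. A numerically stable variant (a standard trick, not part of the original code) subtracts the per-row maximum before exponentiating, which leaves the result mathematically unchanged:

```python
import numpy as np

def softmax_stable(A):
    # softmax is invariant to subtracting a constant per row,
    # so shifting by the row max changes nothing mathematically
    # but keeps np.exp from overflowing
    shifted = A - A.max(axis=1, keepdims=True)
    expA = np.exp(shifted)
    return expA / expA.sum(axis=1, keepdims=True)

z = np.array([[1000.0, 1001.0, 1002.0]])  # the naive version overflows here
probs = softmax_stable(z)
```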
Upvotes: 1