saeed

Reputation: 25

Neural Networks Using Python and NumPy

I am a newbie to neural networks and I am trying to implement one in Python/NumPy, starting from the code in the article "Create a Simple Neural Network in Python from Scratch".

My input array is:

array([[5.71, 5.77, 5.94],
       [5.77, 5.94, 5.51],
       [5.94, 5.51, 5.88],
       [5.51, 5.88, 5.73]])

Output array is:

array([[5.51],
       [5.88],
       [5.73],
       [6.41]])

After running the code, I see the following results, which are not correct:

synaptic_weights after training
[[1.90625275]
 [2.54867698]
 [1.07698312]]
outputs after training
[[1.]
 [1.]
 [1.]
 [1.]]

Here is the core of the code:

for iteration in range(1000):
    input_layer = tr_input
    # forward pass: weighted sum squashed through the sigmoid
    outputs = sigmoid(np.dot(input_layer, synaptic_weights))

    error = tr_output - outputs

    # scale the error by the sigmoid's derivative at each output
    adjustments = error * sigmoid_derivative(outputs)

    synaptic_weights += np.dot(input_layer.T, adjustments)

print('synaptic_weights after training')
print(synaptic_weights)

print('outputs after training')
print(outputs)

What should I change in this code so it works for my data? Or should I take a different approach? Any help is highly appreciated.

Upvotes: 1

Views: 259

Answers (2)

Amaldev

Reputation: 983

These are the steps involved in my neural network implementation:

  • Randomly initialize the weights (Θ)
  • Implement forward propagation
  • Compute the cost function
  • Implement back propagation to compute the partial derivatives
  • Run gradient descent (see the usage sketch after the code below)

def forward_prop(X, theta_list):

    m = X.shape[0]
    a_list = []
    z_list = []

    # input layer with a bias column prepended
    a_list.append(np.insert(X, 0, values=np.ones(m), axis=1))

    for idx, theta in enumerate(theta_list):
        z_list.append(a_list[idx] * theta.T)
        if idx != (len(theta_list) - 1):
            # hidden activations also get a bias column
            a_list.append(np.insert(sigmoid(z_list[idx]), 0, values=np.ones(m), axis=1))
        else:
            # output layer: no bias column
            a_list.append(sigmoid(z_list[idx]))

    return a_list, z_list

def back_prop(params, input_size, hidden_layers, num_labels, X, y, regularization, regularize):

    m = X.shape[0]
    X = np.matrix(X)
    y = np.matrix(y)

    # unroll the flat parameter vector back into one weight matrix per layer
    theta_list = []
    startCount = 0
    for idx, val in enumerate(hidden_layers):
        if idx == 0:
            startCount = val * (input_size + 1)
            theta_list.append(np.matrix(np.reshape(params[:startCount], (val, (input_size + 1)))))
        if idx != 0:
            tempCount = startCount
            startCount += (val * (hidden_layers[idx - 1] + 1))
            theta_list.append(np.matrix(np.reshape(params[tempCount:startCount], (val, (hidden_layers[idx - 1] + 1)))))
        if idx == (len(hidden_layers) - 1):
            theta_list.append(np.matrix(np.reshape(params[startCount:], (num_labels, (val + 1)))))

    a_list, z_list = forward_prop(X, theta_list)
    J = cost(X, y, a_list[-1], theta_list, regularization, regularize)

    # delta for the output layer, then propagate backwards through the hidden layers
    d_list = []
    d_list.append(a_list[-1] - y)

    idx = 0
    while idx < (len(theta_list) - 1):
        d_temp = np.multiply(d_list[idx] * theta_list[len(a_list) - 2 - idx], sigmoid_gradient(a_list[len(a_list) - 2 - idx]))
        d_list.append(d_temp[:, 1:])  # drop the bias component
        idx += 1

    # accumulate the gradient for each weight matrix
    delta_list = []
    for theta in theta_list:
        delta_list.append(np.zeros(theta.shape))

    for idx, delta in enumerate(delta_list):
        delta_list[idx] = delta_list[idx] + ((d_list[len(d_list) - 1 - idx].T) * a_list[idx])
        delta_list[idx] = delta_list[idx] / m

    if regularize:
        # regularize every weight except the bias column
        for idx, delta in enumerate(delta_list):
            delta_list[idx][:, 1:] = delta_list[idx][:, 1:] + (theta_list[idx][:, 1:] * regularization)

    # flatten the gradients back into a single vector, matching params
    grad_list = np.ravel(delta_list[0])
    idx = 1
    while idx < len(delta_list):
        grad_list = np.concatenate((grad_list, np.ravel(delta_list[idx])), axis=None)
        idx += 1

    return J, grad_list

def cost(X, y, h, theta_list, regularization, regularize):

    m = X.shape[0]
    X = np.matrix(X)
    y = np.matrix(y)

    # cross-entropy (logistic) cost averaged over the m examples
    J = (np.multiply(-y, np.log(h)) - np.multiply((1 - y), np.log(1 - h))).sum() / m

    if regularize:
        # penalize every weight except the bias column
        regularization_value = 0.0
        for theta in theta_list:
            regularization_value += np.sum(np.power(theta[:, 1:], 2))
        J += (float(regularization) / (2 * m)) * regularization_value

    return J
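
As a rough usage sketch: the sigmoid helpers, the toy data, the layer sizes, and the learning rate below are assumptions for illustration, not a fixed part of the implementation above (and since the cost is cross-entropy, the labels must be 0/1):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_gradient(a):
    # derivative of the sigmoid expressed in terms of its output value a
    return np.multiply(a, (1 - a))

# toy binary-labelled data, assumed for illustration (logical OR)
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([[0], [1], [1], [1]])

input_size = 2
hidden_layers = [4]      # one hidden layer with 4 units
num_labels = 1

# one flat parameter vector covering all layers, randomly initialized
param_count = (hidden_layers[0] * (input_size + 1)
               + num_labels * (hidden_layers[0] + 1))
params = (np.random.random(param_count) - 0.5) * 0.25

learning_rate = 1.0      # assumed; tune for your data
for i in range(5000):
    J, grad = back_prop(params, input_size, hidden_layers, num_labels,
                        X, y, regularization=0.0, regularize=False)
    params = params - learning_rate * grad

print('final cost:', J)

Note that the implementation relies on np.matrix, where * means matrix multiplication; with plain ndarrays you would use np.dot or @ instead.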

Implementation

Upvotes: 0

aminrd
aminrd

Reputation: 5070

That's because you are using the wrong activation function (i.e. sigmoid). The main reason we use the sigmoid function is that its output lies in the range (0, 1), so it is used for models that have to predict a probability. Since probabilities only exist between 0 and 1, sigmoid is the right choice there; but your targets are all greater than 1, which a sigmoid can never produce.
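
You can see the saturation directly with the numbers from the question itself; this small check just re-applies the trained weights printed above:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([5.71, 5.77, 5.94])                    # first input row from the question
w = np.array([1.90625275, 2.54867698, 1.07698312])  # trained weights from the question
z = np.dot(x, w)
print(z)           # about 32, deep in the sigmoid's saturated region
print(sigmoid(z))  # about 1.0, which is why every output prints as 1.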

If you want to train a model to predict the values in your array, you should use a regression model. Otherwise, you can convert your outputs into labels (for example, map 5.x to 0 and 6.x to 1) and retrain your model.
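
For the first option, here is a minimal sketch of the question's training loop rewritten as plain linear regression: the sigmoid is dropped so the output is an unbounded weighted sum, and the learning rate (an assumption here) must be small because the inputs are large.

import numpy as np

tr_input = np.array([[5.71, 5.77, 5.94],
                     [5.77, 5.94, 5.51],
                     [5.94, 5.51, 5.88],
                     [5.51, 5.88, 5.73]])
tr_output = np.array([[5.51], [5.88], [5.73], [6.41]])

np.random.seed(1)
synaptic_weights = 2 * np.random.random((3, 1)) - 1

learning_rate = 0.001  # assumed; larger steps diverge with inputs this big
for iteration in range(10000):
    outputs = np.dot(tr_input, synaptic_weights)  # identity activation instead of sigmoid
    error = tr_output - outputs
    synaptic_weights += learning_rate * np.dot(tr_input.T, error)

print(outputs)

This converges towards the least-squares fit; adding a bias (intercept) column to the inputs would let the model fit the data more closely.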

Upvotes: 1
