Reputation: 65
I'm trying to build an XOR neural network in Python with one hidden layer, but I'm hitting a problem with dimensions and I can't figure out why I'm getting the wrong dimensions in the first place, because the math looks correct to me.
The dimensions issue starts in the backpropagation part (marked with a comment in the code below). The error specifically is
File "nn.py", line 52, in <module>
d_a1_d_W1 = inp * deriv_sigmoid(z1)
File "/usr/local/lib/python3.7/site-packages/numpy/matrixlib/defmatrix.py", line 220, in __mul__
return N.dot(self, asmatrix(other))
ValueError: shapes (1,2) and (3,1) not aligned: 2 (dim 1) != 3 (dim 0)
Additionally, why does the deriv_sigmoid function here only work if I cast to a numpy array?
Code:
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def deriv_sigmoid(x):
    fx = np.array(sigmoid(x))  # gives dimensions issues unless I cast to array
    return fx * (1 - fx)
hiddenNeurons = 3
outputNeurons = 1
inputNeurons = 2

X = np.array([[0, 1]])
elem = np.matrix(X[0])
elem_row, elem_col = elem.shape
y = np.matrix([1])

W1 = np.random.rand(hiddenNeurons, elem_col)
b1 = np.random.rand(hiddenNeurons, 1)
W2 = np.random.rand(outputNeurons, hiddenNeurons)
b2 = np.random.rand(outputNeurons, 1)
lr = .01
for inp, ytrue in zip(X, y):
    inp = np.matrix(inp)

    # feedforward
    z1 = W1 * inp.T + b1  # weight matrix 1 * inputs + bias 1
    a1 = sigmoid(z1)      # activation of hidden layer
    z2 = W2 * a1 + b2     # weight matrix 2 * activated hidden + bias 2
    a2 = sigmoid(z2)      # activated output
    ypred = a2            # and call it ypred (y prediction)

    # backprop
    d_L_d_ypred = -2 * (ytrue - ypred)     # derivative of mean squared error loss
    d_ypred_d_W2 = a1 * deriv_sigmoid(z2)  # derivative of y prediction with respect to weight matrix 2
    d_ypred_d_b2 = deriv_sigmoid(z2)       # derivative of y prediction with respect to bias 2
    d_ypred_d_a1 = W2 * deriv_sigmoid(z2)  # derivative of y prediction with respect to hidden activation
    d_a1_d_W1 = inp * deriv_sigmoid(z1)    # dimensions issue starts here ––––––––––––––––––––––––
    d_a1_d_b1 = deriv_sigmoid(b1)

    W1 -= lr * d_L_d_ypred * d_ypred_d_a1 * d_a1_d_W1
    b1 -= lr * d_L_d_ypred * d_ypred_d_a1 * d_a1_d_b1
    W2 -= lr * d_L_d_ypred * d_ypred_d_W2
    b2 -= lr * d_L_d_ypred * d_ypred_d_b2
Upvotes: 2
Views: 182
Reputation: 3934
I've never tried to work with neural networks, so I don't fully understand what you are trying to do. But I'd guess the confusion is about how a * b behaves when a and b are np.matrix objects rather than numpy arrays: on numpy arrays, * does an element-wise multiplication; on np.matrix objects, it does a matrix multiplication.
a = np.array([[1, 2], [3, 4]])
b = a - 1
b
# array([[0, 1],
#        [2, 3]])
a * b  # element-wise multiplication
# array([[ 0,  2],     [[ 1*0, 2*1 ],
#        [ 6, 12]])     [ 3*2, 4*3 ]]
am = np.matrix(a)
bm = np.matrix(b)
am * bm  # matrix (dot) multiplication
# matrix([[ 4,  7],    [[ 1*0+2*2, 1*1+2*3 ],
#         [ 8, 15]])    [ 3*0+4*2, 3*1+4*3 ]]
In the deriv_sigmoid function (without the np.array cast), if x is a (3, 1) matrix then fx is a matrix of the same shape. fx * (1 - fx) then raises an exception, because a (3, 1) matrix can't be matrix-multiplied by another (3, 1) matrix.
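To illustrate (using a made-up (3, 1) z1 in place of the question's random values), the elementwise product fails on an np.matrix but works once it is cast back to an array:

```python
import numpy as np

z1 = np.matrix([[0.1], [0.2], [0.3]])  # same (3, 1) matrix shape as in the question
fx = 1 / (1 + np.exp(-z1))             # sigmoid of a matrix is still a (3, 1) matrix

# fx * (1 - fx) attempts a (3,1) @ (3,1) matrix product and raises ValueError
try:
    fx * (1 - fx)
except ValueError as e:
    print("matrix multiply failed:", e)

# Casting to an array makes * element-wise again
fx_arr = np.asarray(fx)
print(fx_arr * (1 - fx_arr))  # (3, 1) array of sigmoid derivatives
```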
The same issue applies in the '# backprop' part of the code.
d_ypred_d_a1 = W2 * deriv_sigmoid(z2)  # derivative of y prediction with respect to hidden activation
# W2 * deriv_sigmoid(z2) fails as the shapes are incompatible with matrix multiplication.
# deriv_sigmoid(z2) * W2 would run, but I guess would return incorrect values (and shape).
d_a1_d_W1 = inp * deriv_sigmoid(z1)
# This fails for the same reason: the (1, 2) shape of inp is incompatible with the (3, 1) shape of deriv_sigmoid(z1).
Unless you actually need matrix multiplication, I think using np.arrays (with * for element-wise products and the @ operator for matrix products) will make the programming easier.
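As a sketch of that suggestion (a hypothetical rewrite of the feedforward step, not your exact setup), plain arrays let you use @ where you mean a matrix product and * where you mean an element-wise one:

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def deriv_sigmoid(x):
    fx = sigmoid(x)       # no np.array cast needed: * is element-wise on arrays
    return fx * (1 - fx)

rng = np.random.default_rng(0)
W1 = rng.random((3, 2))   # hidden x input
b1 = rng.random((3, 1))
W2 = rng.random((1, 3))   # output x hidden
b2 = rng.random((1, 1))

inp = np.array([[0], [1]])  # (2, 1) column vector

z1 = W1 @ inp + b1          # (3, 1): explicit matrix product
a1 = sigmoid(z1)
z2 = W2 @ a1 + b2           # (1, 1)
ypred = sigmoid(z2)
print(z1.shape, z2.shape, ypred.shape)
```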
Upvotes: 1