Reputation: 9
I'm writing a planar data classification program with one hidden layer for a Coursera course.
What should this code do and why doesn't it work?
import numpy as np

def backward_propagation(parameters, cache, X, Y):
    """
    Implement the backward propagation using the instructions above.
    """
    m = X.shape[1]
    # First, retrieve W1 and W2 from the dictionary "parameters".
    ### START CODE HERE ### (≈ 2 lines of code)
    W1 = parameters["W1"]
    W2 = parameters["W2"]
    ### END CODE HERE ###
    # Retrieve also A1 and A2 from dictionary "cache".
    ### START CODE HERE ### (≈ 2 lines of code)
    A1 = cache["A1"]
    A2 = cache["A1"]
    ### END CODE HERE ###
    # Backward propagation: calculate dW1, db1, dW2, db2.
    ### START CODE HERE ### (≈ 6 lines of code, corresponding to 6 equations on slide above)
    dZ2 = A2 - Y
    dW2 = (1 / m) * np.dot(dZ2, A1.T)
    db2 = (1 / m) * np.sum(dZ2, axis=1, keepdims=True)
    dZ1 = np.multiply(np.dot(W2.T, dZ2), 1 - np.power(A1, 2))
    dW1 = (1 / m) * np.dot(dZ1, X.T)
    db1 = (1 / m) * np.sum(dZ1, axis=1, keepdims=True)
    ### END CODE HERE ###
    grads = {"dW1": dW1,
             "db1": db1,
             "dW2": dW2,
             "db2": db2}
    return grads
parameters, cache, X_assess, Y_assess = backward_propagation_test_case()
grads = backward_propagation(parameters, cache, X_assess, Y_assess)
print ("dW1 = "+ str(grads["dW1"]))
print ("db1 = "+ str(grads["db1"]))
print ("dW2 = "+ str(grads["dW2"]))
print ("db2 = "+ str(grads["db2"]))
When I run this code, I get this error:
ValueError: shapes (4,1) and (4,3) not aligned: 1 (dim 1) != 4 (dim 0)
Upvotes: 1
Views: 24123
Reputation: 394
Make sure the np.multiply call on the dZ1 line has its closing bracket at the end:
dZ1 = np.multiply(np.dot(W2.T, dZ2), 1 - np.power(A1, 2))
Also replace A2 = cache["A1"] with A2 = cache["A2"].
Upvotes: 0
Reputation: 771
Try reordering your matrices. Your code:
dW1 = (1 / m) * np.dot(dZ1, X.T)
Try this code instead:
dW1 = (1 / m) * np.dot(X.T, dZ1)
Upvotes: 0
Reputation: 1
You have initialized A2 = cache["A1"], but it should be A2 = cache["A2"].
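For reference, here is a minimal sketch of the function with that one-line fix applied. The course helper backward_propagation_test_case() is not available outside the notebook, so the data below is a hypothetical stand-in: random values with assumed layer sizes (2 inputs, 4 hidden units, 1 output, 3 examples), a tanh hidden layer and a sigmoid output. It only checks that the gradient shapes come out right; it does not reproduce the course's expected numbers.

```python
import numpy as np


def backward_propagation(parameters, cache, X, Y):
    """Gradients for one tanh hidden layer followed by a sigmoid output."""
    m = X.shape[1]  # number of examples

    W2 = parameters["W2"]
    A1 = cache["A1"]
    A2 = cache["A2"]  # the fix: read A2 from "A2", not "A1"

    dZ2 = A2 - Y
    dW2 = (1 / m) * np.dot(dZ2, A1.T)
    db2 = (1 / m) * np.sum(dZ2, axis=1, keepdims=True)
    # Derivative of tanh is 1 - A1**2
    dZ1 = np.multiply(np.dot(W2.T, dZ2), 1 - np.power(A1, 2))
    dW1 = (1 / m) * np.dot(dZ1, X.T)
    db1 = (1 / m) * np.sum(dZ1, axis=1, keepdims=True)

    return {"dW1": dW1, "db1": db1, "dW2": dW2, "db2": db2}


# Hypothetical stand-in for the course test case (sizes are assumptions).
rng = np.random.default_rng(0)
n_x, n_h, n_y, m = 2, 4, 1, 3
X = rng.standard_normal((n_x, m))
Y = rng.integers(0, 2, size=(n_y, m))
parameters = {
    "W1": rng.standard_normal((n_h, n_x)),
    "W2": rng.standard_normal((n_y, n_h)),
}

# Forward pass with zero biases, just to fill the cache.
A1 = np.tanh(np.dot(parameters["W1"], X))
A2 = 1 / (1 + np.exp(-np.dot(parameters["W2"], A1)))
cache = {"A1": A1, "A2": A2}

grads = backward_propagation(parameters, cache, X, Y)
for name, value in grads.items():
    print(name, value.shape)
```

Each gradient should have the same shape as the parameter it updates, which is a quick sanity check worth running before comparing against the expected output.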
Upvotes: 0
Reputation: 360
When multiplying two matrices with np.dot, the number of columns of the first matrix must equal the number of rows of the second. That is why NumPy raises the error: you can't multiply a 4x1 matrix by a 4x3 matrix.
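A small sketch of that rule, using placeholder arrays whose shapes are assumed from the error message (4 hidden units, 3 examples, 1 output unit). When A2 is read from cache["A1"], dZ2 takes A1's 4x3 shape, so np.dot(W2.T, dZ2) tries to multiply 4x1 by 4x3 and fails. With the real 1x3 A2, the inner dimensions match.

```python
import numpy as np

# Placeholder arrays; only the shapes matter here (they are assumptions).
W2 = np.ones((1, 4))  # 1 output unit, 4 hidden units
A1 = np.ones((4, 3))  # hidden activations for 3 examples
A2 = np.ones((1, 3))  # output activations for 3 examples
Y = np.zeros((1, 3))

# Bug: using A1 in place of A2 makes dZ2 broadcast to shape (4, 3).
bad_dZ2 = A1 - Y
try:
    np.dot(W2.T, bad_dZ2)  # (4, 1) times (4, 3): inner dims 1 and 4 differ
    mismatch = None
except ValueError as err:
    mismatch = str(err)
print("bad dZ2 shape:", bad_dZ2.shape, "| error:", mismatch)

# Fix: with the real A2, dZ2 is (1, 3) and the product is defined.
good_dZ2 = A2 - Y
product = np.dot(W2.T, good_dZ2)  # (4, 1) times (1, 3) gives (4, 3)
print("good dZ2 shape:", good_dZ2.shape, "| product shape:", product.shape)
```

Printing .shape on both operands just before the failing np.dot call is usually the fastest way to find which array has the wrong dimensions.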
Upvotes: 5