Reputation: 4038
I am trying to follow the course at http://cs231n.github.io/optimization-1/. In the section "Computing the gradient numerically with finite differences", they provide a code snippet that should compute the gradient given a function and an array. I tried to run it with my own function and a numpy array as input, and I get the following error:
ValueError Traceback (most recent call last)
<ipython-input-18-31c1f1d6169c> in <module>()
2 return a
3
----> 4 eval_numerical_gradient(f,np.array([1,2,3,4,5]))
<ipython-input-12-d6bea4220895> in eval_numerical_gradient(f, x)
28 print(x[ix])
29 # compute the partial derivative
---> 30 grad[ix] = (fxh - fx) / h # the slope
31 it.iternext() # step to next dimension
32
ValueError: setting an array element with a sequence.
I understand that the error occurs because a sequence cannot be assigned to grad[ix]. I also tried a column array and got the same error.
Here is the code:
def eval_numerical_gradient(f, x):
    """
    a naive implementation of numerical gradient of f at x
    - f should be a function that takes a single argument
    - x is the point (numpy array) to evaluate the gradient at
    """
    fx = f(x) # evaluate function value at original point
    print(x)
    print(fx)
    grad = np.zeros(x.shape)
    h = 0.00001
    # iterate over all indexes in x
    it = np.nditer(x, flags=['multi_index'], op_flags=['readwrite'])
    while not it.finished:
        print(it)
        # evaluate function at x+h
        ix = it.multi_index
        print(ix)
        old_value = x[ix]
        print(old_value)
        x[ix] = old_value + h # increment by h
        print(x)
        fxh = f(x) # evaluate f(x + h)
        print(fxh)
        x[ix] = old_value # restore to previous value (very important!)
        print(x[ix])
        # compute the partial derivative
        grad[ix] = (fxh - fx) / h # the slope
        it.iternext() # step to next dimension
    return grad
My question is: is my numpy array input (row or column) wrong? Can somebody explain why this is happening?
Sample input:
def f(a):
    return a
eval_numerical_gradient(f,np.array([[1],[2],[3]]))
and
def f(a):
    return a
eval_numerical_gradient(f,np.array([1,2,3]))
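To narrow it down, the same kind of failure can be reproduced without the gradient function at all. This is only a minimal sketch: the `slope` array is a made-up stand-in for `(fxh - fx) / h`, which is an array in my case because `f` returns its whole input.

```python
import numpy as np

grad = np.zeros(3)

# Hypothetical stand-in for (fxh - fx) / h when f returns an array
# of the same shape as x instead of a single number.
slope = np.array([1.0, 0.0, 0.0])

try:
    grad[0] = slope  # one slot of grad cannot hold a 3-element array
    message = None
except ValueError as err:
    message = str(err)

print(message)
```

So the assignment itself fails whenever the right-hand side is an array, regardless of whether x is a row or a column.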
Upvotes: 1
Views: 358
Reputation: 578
I suggest the following fix for eval_numerical_gradient(f, x). Replace
fxh = f(x)
with
fxh = f(x[ix])
and replace
grad[ix] = (fxh - fx) / h
with
grad[ix] = (fxh - fx[ix]) / h
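Also make the input array hold floating-point entries. With an integer array, old_value + h is truncated back to an integer when it is stored in x, so the increment is lost. For example:
eval_numerical_gradient(f,np.array([[1],[2],[3]], dtype=float))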
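Putting the two replacements together, a sketch of the changed function could look like the following. The name `eval_numerical_gradient_elementwise` and the `identity` test function are hypothetical, chosen here only for illustration, and the sketch assumes f acts on each entry independently, so that f(x) has the same shape as x.

```python
import numpy as np

def eval_numerical_gradient_elementwise(f, x):
    """Forward-difference slope of an elementwise f at each entry of x.

    Hypothetical variant: assumes f(x) has the same shape as x.
    """
    fx = f(x)                      # f evaluated at every entry of x
    grad = np.zeros(x.shape)
    h = 0.00001
    it = np.nditer(x, flags=['multi_index'], op_flags=['readwrite'])
    while not it.finished:
        ix = it.multi_index
        old_value = x[ix]
        x[ix] = old_value + h      # increment this entry by h
        fxh = f(x[ix])             # first change: evaluate f on the single entry
        x[ix] = old_value          # restore the original value
        grad[ix] = (fxh - fx[ix]) / h  # second change: compare scalar with scalar
        it.iternext()
    return grad

def identity(a):
    return a

# Float input: the increment survives and the slope of the identity is about 1.
grad_float = eval_numerical_gradient_elementwise(
    identity, np.array([[1], [2], [3]], dtype=float))

# Integer input: old_value + h is truncated when written back into x.
grad_int = eval_numerical_gradient_elementwise(
    identity, np.array([[1], [2], [3]]))

print(grad_float)
print(grad_int)
```

With the integer array every slope comes out as 0, which is why the float dtype matters. Note that this variant only makes sense for an elementwise f; as far as I can tell, the course's original function is meant for an f that returns a single number (such as a loss), in which case the original code works unchanged.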
Upvotes: 2