Matt

Reputation: 2329

fmin_cg function usage for minimizing neural network cost function

I am trying to port some of my code from MATLAB to Python and am running into problems with the scipy.optimize.fmin_cg function. This is the code I have at the moment:

My cost function:

def nn_costfunction2(nn_params,*args):
    Theta1, Theta2 = reshapeTheta(nn_params)

    input_layer_size, hidden_layer_size, num_labels, X, y, lam = args[0], args[1], args[2], args[3], args[4], args[5]   

    m = X.shape[0] #Length of vector
    X = np.hstack((np.ones([m,1]),X)) #Add in the bias unit

    layer1 = sigmoid(Theta1.dot(np.transpose(X))) #Calculate first layer
    layer1 = np.vstack((np.ones([1,layer1.shape[1]]),layer1)) #Add in bias unit
    layer2 = sigmoid(Theta2.dot(layer1))

    y_matrix = np.zeros([y.shape[0],layer2.shape[0]]) #Create a matrix where vector position of one corresponds to label
    for i in range(y.shape[0]):
        y_matrix[i,y[i]-1] = 1

    #Cost function
    J = (1/m)*np.sum(np.sum(-y_matrix.T.conj()*np.log(layer2),axis=0)-np.sum((1-y_matrix.T.conj())*np.log(1-layer2),axis=0))
    #Add in regularization
    J = J+(lam/(2*m))*np.sum(np.sum(Theta1[:,1:].conj()*Theta1[:,1:])+np.sum(Theta2[:,1:].conj()*Theta2[:,1:]))

    #Backpropagation with vectorization and regularization
    delta_3 = layer2 - y_matrix.T
    r2 = delta_3.T.dot(Theta2[:,1:])
    z_2 = Theta1.dot(X.T)
    delta_2 = r2*sigmoidGradient(z_2).T
    t1 = (lam/m)*Theta1[:,1:]
    t1 = np.hstack((np.zeros([t1.shape[0],1]),t1))
    t2 = (lam/m)*Theta2[:,1:]
    t2 = np.hstack((np.zeros([t2.shape[0],1]),t2))
    Theta1_grad = (1/m)*(delta_2.T.dot(X))+t1
    Theta2_grad = (1/m)*(delta_3.dot(layer1.T))+t2

    nn_params = np.hstack([Theta1_grad.flatten(),Theta2_grad.flatten()]) #Unroll parameters

    return nn_params

My call of the function:

args = (input_layer_size, hidden_layer_size, num_labels, X, y, lam)
fmin_cg(nn_costfunction2,nn_params, args=args,maxiter=50)

Gives the following error:

  File "C:\WinPython3\python-3.3.2.amd64\lib\site-packages\scipy\optimize\optimize.py", line 588, in approx_fprime
    grad[k] = (f(*((xk+d,)+args)) - f0) / d[k]

ValueError: setting an array element with a sequence.

I have tried various permutations of passing arguments to fmin_cg, but this is the farthest I got. Running the cost function on its own in this form does not throw any errors.
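For reference, the error can be reproduced without any of the network code. When no fprime is given, fmin_cg numerically differentiates the objective, and that code expects a scalar return value; returning an array (as the cost function above does) makes it fail. A minimal, purely illustrative reproduction (the objective here is made up):

```python
import numpy as np
from scipy.optimize import fmin_cg

def bad_f(x):
    # Returns an array instead of a scalar cost -- this is the
    # same mistake as returning the unrolled gradient from the
    # cost function above.
    return np.array([x[0]**2, x[1]**2])

try:
    fmin_cg(bad_f, np.array([1.0, 1.0]), maxiter=5, disp=False)
    failed = False
except Exception:
    failed = True  # array-valued objective raises inside the optimizer
```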

Upvotes: 0

Views: 892

Answers (3)

Pradi KL

Reputation: 726

I see this issue is due to the fact that you let nnCostFunction2 return both the cost and the gradient.

But scipy.optimize.fmin_cg will only accept a single scalar cost output from nnCostFunction2.

So return only the single J (cost) value from the nnCostFunction2 function, and pass the gradient function separately as the fprime argument.

this is my function which is working:

scipy.optimize.fmin_cg(nnCostFunction, initial_rand_theta, backpropagate,
                       args=(hidden_s, input_s, num_labels, X, y, lamb),
                       maxiter=1000, disp=True, full_output=True)
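The same split can be seen on a toy problem (everything below is made up for illustration, not the asker's network): the objective returns only a scalar, and the gradient goes in as the separate fprime argument:

```python
import numpy as np
from scipy.optimize import fmin_cg

# Least-squares toy problem: minimize 0.5 * ||A.theta - b||^2
def cost(theta, A, b):
    r = A.dot(theta) - b
    return 0.5 * r.dot(r)            # scalar, as fmin_cg requires

def grad(theta, A, b):
    return A.T.dot(A.dot(theta) - b)  # 1-D array, same shape as theta

A = np.array([[2.0, 0.0],
              [0.0, 1.0]])
b = np.array([4.0, 3.0])
theta0 = np.zeros(2)

# Gradient is supplied via fprime, so fmin_cg never has to
# finite-difference the objective.
theta_opt = fmin_cg(cost, theta0, fprime=grad, args=(A, b), disp=False)
```

Here the minimum is at theta = [2, 3] (where A.theta = b), which the optimizer recovers.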

Upvotes: 0

Lerok

Reputation: 11

Try adding the epsilon (finite-difference step size) keyword argument to the function call:

fmin_cg(nn_costfunction2, nn_params, args=args, epsilon=epsilon, maxiter=50)

Upvotes: 1

lennon310

Reputation: 12699

The input variable of the cost function should be a 1D array, so your Theta1 and Theta2 in J have to be derived from nn_params inside the function. And you need to return the scalar J as well.
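A minimal sketch of that reshaping, with hypothetical layer sizes (the split point follows the usual (hidden, input+1) / (labels, hidden+1) layout implied by the question's bias handling):

```python
import numpy as np

input_layer_size, hidden_layer_size, num_labels = 3, 4, 2

def reshape_theta(nn_params):
    # The optimizer hands the cost function one flat 1-D vector;
    # rebuild the two weight matrices from it.
    cut = hidden_layer_size * (input_layer_size + 1)
    Theta1 = nn_params[:cut].reshape(hidden_layer_size, input_layer_size + 1)
    Theta2 = nn_params[cut:].reshape(num_labels, hidden_layer_size + 1)
    return Theta1, Theta2

n_params = (hidden_layer_size * (input_layer_size + 1)
            + num_labels * (hidden_layer_size + 1))
nn_params = np.arange(float(n_params))  # stand-in for real weights

Theta1, Theta2 = reshape_theta(nn_params)
```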

Upvotes: 1
