Matin Shokri

Reputation: 62

How to control differential chain rule in Keras

I have a convolutional neural network with some layers in Keras. The last layer in this network is a custom layer that is responsible for sorting some numbers it receives from the previous layer; the output of this custom layer is then used to calculate the loss function.

For this purpose (sorting), I use some operators in this layer, such as K.argmax and K.gather.

In the back-propagation phase I get an error from Keras that says:

An operation has None for gradient. Please make sure that all of your ops have a gradient defined (i.e. are differentiable). Common ops without gradient: K.argmax, K.round, K.eval

That is reasonable, because this layer is involved in the differentiation process.

Given that my custom layer does not need to participate in the differentiation chain rule, how can I control the chain rule in Keras? Can I disable this process for a custom layer?

The Reorder layer that I use in my code is simply the following:

from keras.layers import Lambda
from keras import backend as K

def Reorder(args):
    z = args[0]
    l = args[1]
    # For each sample, take the index of the largest entry of l ...
    index = K.tf.argmax(l, axis=1)
    # ... and gather the corresponding rows of z.
    return K.tf.gather(z, index)

Reorder_Layer = Lambda(Reorder, name='out_x')
pred_x = Reorder_Layer([z, op])

Upvotes: 1

Views: 224

Answers (1)

Daniel Möller

Reputation: 86600

A few things:

  • It's impossible to train without a derivative, so there is no solution if you want to train this model.
  • It's not necessary to compile the model if you are only going to predict, so you don't need custom derivation rules; see the sketch after this list.
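As a minimal sketch of the second point (the input size and Dense layers here are hypothetical, only the Reorder wiring follows the question): an uncompiled model can still run predict(), because inference only needs the forward pass and never asks for the missing argmax/gather gradients.

from keras.layers import Input, Dense, Lambda
from keras.models import Model
from keras import backend as K

def Reorder(args):
    z, l = args
    return K.tf.gather(z, K.tf.argmax(l, axis=1))

inp = Input(shape=(10,))                          # hypothetical input
z = Dense(4)(inp)                                 # hypothetical branch producing z
op = Dense(4, activation='softmax')(inp)          # hypothetical branch producing l
pred_x = Lambda(Reorder, name='out_x')([z, op])

model = Model(inp, pred_x)
# No model.compile(...) needed for inference:
# predictions = model.predict(some_numpy_batch)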

If the problem is really in that layer, I suppose that l is computed by the model using trainable layers before it.

If you really want to try this, which doesn't seem a good idea, you can try l = keras.backend.stop_gradient(args[1]). But this means that absolutely nothing will be trained from l back to the beginning of the model. If this doesn't work, then you have to make all layers that produce l have trainable=False before compiling the model.
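A minimal sketch of both suggestions, reusing the Reorder layer from the question (the list of frozen layers is hypothetical):

from keras import backend as K

def Reorder(args):
    z = args[0]
    # Option 1: stop the gradient at l; nothing that produces l will be
    # trained through this branch.
    l = K.stop_gradient(args[1])
    index = K.tf.argmax(l, axis=1)
    return K.tf.gather(z, index)

# Option 2: if that is not enough, freeze every layer that produces l
# before compiling, so Keras never requests their gradients.
# for layer in layers_producing_l:   # hypothetical list of layers
#     layer.trainable = False
# model.compile(optimizer='adam', loss='mse')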

Upvotes: 1
