How do I train the NN layers when the loss function takes a derivative of the NN?

To give you some context, I am training a neural network that learns a Hamiltonian. To do so, I must use a customized neural network like the following one:

import numpy as np
import tensorflow as tf
from tensorflow import keras

class HNN(keras.Model):
  def __init__(self, input_dim=2, hidden_dim=200):
    super(HNN, self).__init__()
    self.dense1 = tf.keras.layers.Dense(hidden_dim, activation='tanh')
    self.dense2 = tf.keras.layers.Dense(hidden_dim, activation='tanh')
    self.dense3 = tf.keras.layers.Dense(1)
    # Symplectic matrix M = [[0, I], [-I, 0]] used to rotate the gradient of H
    O = np.zeros((input_dim//2, input_dim//2))
    I = np.identity(input_dim//2)
    M = np.concatenate([np.concatenate((O, I), axis=1), np.concatenate((-I, O), axis=1)], axis=0)
    self.M = tf.constant(M, dtype='double')

  def call(self, x):
    # Plain forward pass: the scalar Hamiltonian H(q, p)
    y = self.dense1(x)
    y = self.dense2(y)
    y = self.dense3(y)
    return y

  def forward(self, x):
    # Differentiate H with respect to its input and rotate the gradient by M
    with tf.GradientTape() as tape:
        tape.watch(x)  # make sure the input is watched even if it is not a tf.Variable
        y = self.dense1(x)
        y = self.dense2(y)
        y = self.dense3(y)
    y = tape.gradient(y, x)
    y = self.M @ y
    return y

This neural network takes a (2, batch_size) input (the canonical coordinates) and returns its own symplectic gradient. To put it simply, this gradient is the same as the classical gradient, but rotated 90 degrees by M so that it points along the directions where the Hamiltonian is constant (energy is conserved); a small sanity check of what M does is shown right below.
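To make the rotated gradient concrete, here is a tiny NumPy sketch (not part of the model; it assumes the harmonic oscillator H = (q^2 + p^2)/2, whose classical gradient is (q, p) and whose symplectic gradient is (p, -q)):

import numpy as np

# Same construction of M as in __init__, for input_dim = 2
O = np.zeros((1, 1))
I = np.identity(1)
M = np.concatenate([np.concatenate((O, I), axis=1),
                    np.concatenate((-I, O), axis=1)], axis=0)
# M = [[ 0, 1],
#      [-1, 0]]

# For H(q, p) = (q**2 + p**2) / 2 the classical gradient is (q, p)
q, p = 0.3, 0.7
grad_H = np.array([[q], [p]])

# M rotates it into the symplectic gradient (p, -q) = (dq/dt, dp/dt)
print(M @ grad_H)  # [[ 0.7], [-0.3]]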

Now, the thing is that the loss function has the following form (the mean squared error between the target time derivatives and the symplectic gradient produced by the network):

L = || (dq/dt, dp/dt) - M ∇_x H(q, p) ||^2

where in this case H is the neural network itself, which is why 'forward' has that form with the gradient tape. Finally, my training function is this one:

def train_HNN(data, learning_rate=1e-3, epochs=200):
    model = HNN(input_dim=data[['q', 'p']].shape[1], hidden_dim=HIDDEN_DIM)
    loss_func = tf.keras.losses.MeanSquaredError()
    optimizer = tf.keras.optimizers.Adam(learning_rate=learning_rate)
    for i in range(epochs):
        with tf.GradientTape() as t:
            t.watch(model.trainable_variables)
            # 'forward' returns M @ dH/dx, so the loss compares it to (dq/dt, dp/dt)
            predictions = model.forward(tf.Variable(tf.stack(data[['q', 'p']])))
            loss = loss_func(tf.Variable(tf.stack(data[['dq', 'dp']])), predictions)
        # Gradients of the loss with respect to the weights, through the inner tape
        gradients = t.gradient(loss, model.trainable_variables)
        optimizer.apply_gradients(zip(gradients, model.trainable_variables))

        print(i, loss)
    return model

As you can see, my loss function does not explicitly take the weights as the quantity it has to train.
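In case it clarifies the setup, here is a stripped-down sketch of the gradient-of-a-gradient pattern I am relying on (the layer toy_dense and the values are made up, not from my actual code): the outer tape differentiates a loss built only from the inner tape's gradient, and the kernel still receives a gradient through it.

import tensorflow as tf

# Hypothetical toy layer, only to illustrate the nested-tape pattern
toy_dense = tf.keras.layers.Dense(1)
x = tf.constant([[0.3, 0.7]])
_ = toy_dense(x)  # build the layer's kernel and bias once

with tf.GradientTape() as outer:
    with tf.GradientTape() as inner:
        inner.watch(x)
        y = toy_dense(x)              # scalar output, like H
    dy_dx = inner.gradient(y, x)      # analogue of what 'forward' returns
    loss = tf.reduce_sum(dy_dx ** 2)  # loss built only from the derivative

# The weights still get gradients through the inner tape.gradient call
grads = outer.gradient(loss, toy_dense.trainable_variables)
print(grads)  # here the kernel gets a gradient, while the bias gradient is None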

My dataset is a pandas DataFrame of the form:

q      p      dqdt   dpdt
float  float  float  float
float  float  float  float
float  float  float  float
...
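For reference, this is roughly how those columns map onto input and target tensors (a toy sketch with made-up values; in train_HNN above I use tf.stack on the DataFrame slices instead):

import pandas as pd
import tensorflow as tf

# Toy DataFrame with the same layout (values are made up)
df = pd.DataFrame({'q':    [0.1,  0.2],
                   'p':    [0.3,  0.4],
                   'dqdt': [0.3,  0.4],
                   'dpdt': [-0.1, -0.2]})

# Inputs are the canonical coordinates, targets the time derivatives
x_train = tf.convert_to_tensor(df[['q', 'p']].values, dtype=tf.float64)
y_train = tf.convert_to_tensor(df[['dqdt', 'dpdt']].values, dtype=tf.float64)
print(x_train.shape, y_train.shape)  # (2, 2) (2, 2)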

Now, the issue here is that I am getting the following warning.

WARNING:tensorflow:Gradients do not exist for variables ['dense_8/bias:0'] when minimizing the loss. If you're using `model.compile()`, did you forget to provide a `loss` argument?

However, it is working as expected. Here are some graphs that show the result: my neural network preserves energy as expected.

How can I fix this warning? And why is my code working anyway?

Upvotes: 1

Views: 45

Answers (0)
