Reputation: 13
I'm having trouble correctly adding physics-informed losses to my training code for my neural network.
I have an encoder that takes an input curve X(w), where w is an independent variable (not itself included in the encoder input), and predicts parameters y. The parameters y can be inferred from X(w), and once known they can also be used to calculate X(w) given w. Since the parameters y are known, I'm taking a supervised learning approach to train my encoder like so:
Data-driven Supervised Learning
Note: train_step() is a method of a custom tf.keras.models.Model subclass that isn't included here.
For simplicity, assume that X has a shape of (N, 1000) and y has a shape of (N, 2), where N is the batch size:
def train_step(self, X, y):
    # Define object to compute loss
    loss_object = tf.keras.losses.MeanSquaredError()
    # Define tape for gradient calculations
    with tf.GradientTape() as tape:
        # Get predicted encoder results
        # X -> (N, 1000)
        y_pred = self.encoder(X)  # y, y_pred -> (N, 2)
        # Calculate data-based loss
        total_loss = loss_object(y, y_pred)
    # Compute the gradients from the total_loss
    grads = tape.gradient(total_loss, self.trainable_weights)
    """Apply gradients to optimizer, etc."""
I can theoretically recalculate X(w) from y_pred (given that w is known and constant for all inputs of X) with some known physics, and could define another class method that looks something like this:
def calc_X(w, y0, y1):
    """Known equation here"""
    return X
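(Just to make the shapes concrete, a hypothetical stand-in for calc_X — not my actual physics — could look like this, where each row of parameters generates one curve over w:)
import tensorflow as tf

def calc_X(w, y0, y1):
    # Hypothetical placeholder equation, NOT my real physics:
    # X(w) = y0 * exp(-y1 * w), broadcast over the batch
    # w -> (1000,), y0, y1 -> (N, 1), result -> (N, 1000)
    return y0 * tf.exp(-y1 * w)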
Because I have an actual equation to reconstruct X(w) from y, I figured I could add to the total loss by comparing X_pred to the input X (similar to an autoencoder), and then calculating the gradients like so:
def train_step(self, X, y):
    # Define object to compute loss
    loss_object = tf.keras.losses.MeanSquaredError()
    # Define tape for gradient calculations
    with tf.GradientTape() as tape:
        # Get predicted encoder results
        # X -> (N, 1000)
        y_pred = self.encoder(X)  # y, y_pred -> (N, 2)
        # Get reconstructed input from y_pred
        w = tf.constant(np.arange(0, 1000))
        y0 = y_pred.numpy()[:, 0]
        y1 = y_pred.numpy()[:, 1]
        X_pred = calc_X(w, y0, y1)
        # Define losses
        data_loss = loss_object(y, y_pred)
        reconstruction_loss = loss_object(X, X_pred)
        total_loss = data_loss + reconstruction_loss
    # Calculate gradients
    grads = tape.gradient(total_loss, self.trainable_weights)
    """Apply gradients to optimizer, etc."""
When training with my complete code, I've observed that the calculated gradients are identical whether I use only the data-driven loss or also include the reconstruction_loss. The values of total_loss differ between the two cases, yet the gradients remain the same.
How do I correctly get the reconstruction_loss to contribute to the gradient calculation?
My guess is that the tape from GradientTape is not tracking any of the calculations used to get X_pred, even though y0 and y1 come from the encoder's output. Is it as simple as just calling tape.watch(y0), tape.watch(y1)?
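Or, alternatively, do I need to keep the whole reconstruction in TensorFlow ops so the tape can track the path from the encoder output to X_pred? Just as a sketch of what I mean (using the hypothetical placeholder calc_X from above, not my real equation):
import numpy as np
import tensorflow as tf

def train_step(self, X, y):
    loss_object = tf.keras.losses.MeanSquaredError()
    with tf.GradientTape() as tape:
        y_pred = self.encoder(X)              # (N, 2)
        # Slice with TensorFlow ops so y0/y1 stay connected to the tape
        y0 = y_pred[:, 0:1]                   # (N, 1)
        y1 = y_pred[:, 1:2]                   # (N, 1)
        w = tf.constant(np.arange(1000), dtype=tf.float32)  # (1000,)
        X_pred = calc_X(w, y0, y1)            # (N, 1000), placeholder physics
        data_loss = loss_object(y, y_pred)
        reconstruction_loss = loss_object(X, X_pred)
        total_loss = data_loss + reconstruction_loss
    grads = tape.gradient(total_loss, self.trainable_weights)
    """Apply gradients to optimizer, etc."""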
Upvotes: 1
Views: 35