tsf44

Reputation: 13

How to use Encoder predictions for additional loss and gradient calculations (Tensorflow)

Problem

I'm having trouble correctly adding physics-informed losses to my training code for my neural network.

Background

I have an encoder that takes an input curve, X(w), where w is an independent variable (not itself part of the encoder input), and predicts parameters, y. The parameters y can be inferred from X(w), and once known they can in turn be used to calculate X(w) for a given w. Since the true parameters y are known for my training data, I'm taking a supervised learning approach to train my encoder like so:

Data-driven Supervised Learning
Note: train_step() is a method of a custom tf.keras.models.Model subclass not shown here.
For simplicity, assume that X has shape (N, 1000) and y has shape (N, 2), where N is the batch size:

def train_step(self, X, y):
    # Define object to compute loss
    loss_object = tf.keras.losses.MeanSquaredError()

    # Define tape for gradient calculations
    with tf.GradientTape() as tape:
        # Get predicted encoder results
        # X -> (N, 1000)
        y_pred = self.encoder(X)  # y, y_pred -> (N, 2)

        # Calculate data-based loss
        total_loss = loss_object(y, y_pred)

    # Compute the gradients from the total_loss
    grads = tape.gradient(total_loss, self.trainable_weights)

    """Apply gradients to optimizer, etc."""

Desired Result

Since w is known and constant for all inputs X, I can in principle recalculate X(w) from y_pred using known physics, and could define another class method that looks something like this:

def calc_X(w, y0, y1):
    """Known equation here"""
    return X
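
For concreteness, here is a toy stand-in with the same structure (the real equation is different; the exponential decay is purely illustrative):

def calc_X(w, y0, y1):
    # Hypothetical physics: X(w) = y0 * exp(-y1 * w)
    # y0, y1 -> (N,); w -> (1000,) and must be a float tensor
    y0 = tf.reshape(y0, (-1, 1))  # (N, 1)
    y1 = tf.reshape(y1, (-1, 1))  # (N, 1)
    return y0 * tf.exp(-y1 * w)   # broadcasts to (N, 1000)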

Because I have an actual equation to reconstruct X(w) from y, I figured I could add to the total loss by comparing X_pred to the input X (similar to an autoencoder), then calculate the gradients like so:

def train_step(self, X, y):
    # Define object to compute loss
    loss_object = tf.keras.losses.MeanSquaredError()

    # Define tape for gradient calculations
    with tf.GradientTape() as tape:
        # Get predicted encoder results
        # X -> (N, 1000)
        y_pred = self.encoder(X)  # y, y_pred -> (N, 2)

        # Get reconstructed input from y_pred
        w = tf.constant(np.arange(0, 1000))
        y0 = y_pred.numpy()[:, 0]
        y1 = y_pred.numpy()[:, 1]
        X_pred = calc_X(w, y0, y1)

        # Define losses
        data_loss = loss_object(y, y_pred)
        reconstruction_loss = loss_object(X, X_pred)
        total_loss = data_loss + reconstruction_loss

    # Calculate gradients
    grads = tape.gradient(total_loss, self.trainable_weights)

    """Apply gradients to optimizer, etc."""

Complications and Question

When training with my complete code, I've observed that the calculated gradients are identical whether I use the purely data-driven loss alone or include the reconstruction_loss. The values of total_loss differ between the two cases, yet the gradients remain the same.

How do I correctly get the reconstruction_loss to contribute to the gradient calculation?

My guess is that the tape from GradientTape is not tracking any of the calculations used to get X_pred, even though y0 and y1 come from the encoder's output. Is it as simple as just calling tape.watch(y0), tape.watch(y1)?
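
For reference, this is the tensor-only variant I'm considering, keeping y0 and y1 as tensors so that every operation stays on the tape (this assumes calc_X can be written entirely with TensorFlow ops):

with tf.GradientTape() as tape:
    y_pred = self.encoder(X)  # (N, 2)

    # Slice the prediction tensor directly instead of calling .numpy()
    w = tf.range(0, 1000, dtype=tf.float32)  # (1000,)
    y0 = y_pred[:, 0]  # (N,)
    y1 = y_pred[:, 1]  # (N,)
    X_pred = calc_X(w, y0, y1)

    data_loss = loss_object(y, y_pred)
    reconstruction_loss = loss_object(X, X_pred)
    total_loss = data_loss + reconstruction_loss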

Upvotes: 1

Views: 35

Answers (0)
