Zahra Kokhazad

Reputation: 9

None Gradients for a model with 2 outputs

I have a model that contains a custom GRU implementation and processes audio samples. In each forward pass I process a single sample of an audio file. To reproduce the GRU's recurrent behavior correctly, the model returns the GRU state on every forward pass, and I feed that state back in, alongside the other inputs, as the GRU's initial state for the next forward pass. This state output is therefore never used in my loss function.
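
A minimal standalone sketch of the pattern I mean (all names here, e.g. StatefulStep, UNITS, SAMPLE_DIM, are illustrative, not my real code):

    import tensorflow as tf

    UNITS, SAMPLE_DIM = 8, 4

    class StatefulStep(tf.keras.Model):
        # One forward pass over a single sample; the GRU state is returned
        # to the caller and fed back in on the next call.
        def __init__(self):
            super().__init__()
            self.cell = tf.keras.layers.GRUCell(UNITS)
            self.dense = tf.keras.layers.Dense(SAMPLE_DIM)

        def call(self, inputs, training=False):
            sample, prev_state = inputs                # prev_state: [batch, UNITS]
            gru_out, new_states = self.cell(sample, [prev_state])
            return self.dense(gru_out), new_states[0]  # (primary_output, gru_next)

    model = StatefulStep()
    state = tf.zeros([1, UNITS])                       # initial GRU state
    sample = tf.random.normal([1, SAMPLE_DIM])
    primary_output, state = model([sample, state], training=True)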

To compute gradients I use tf.GradientTape().gradient, but it returns None for every variable. Could the fact that one of my outputs is not used in the loss calculation be the cause of these None gradients?

Below is a schematic of my training loop:

for epoch in range(epochs):
    for batch in dataset:
        with tf.GradientTape() as tape:
            total_loss = 0.0
            for audio in batch:
                for sample in audio:
                    # the state output (gru_next) is never used in the loss;
                    # it is only fed back in as the next initial GRU state
                    primary_output, gru_next = my_model(
                        [other_inputs, previous_gru_output], training=True)
                    stacked_primary_outputs[sample] = primary_output
                    previous_gru_output = gru_next
                enhanced_audio = create_the_output_audio_by_accumulating_primary_output(
                    stacked_primary_outputs)
                single_audio_loss = my_loss_function(clean_audio, enhanced_audio)
                total_loss += single_audio_loss
            grads = tape.gradient(total_loss, my_model.trainable_weights)
            optimizer.apply_gradients(zip(grads, my_model.trainable_weights))
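
For completeness, this is the kind of check I can run right after tape.gradient to see the per-variable result (just a sketch; with my model it currently prints None for every variable):

    for var, g in zip(my_model.trainable_weights, grads):
        print(var.name, "None" if g is None else g.shape)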

Upvotes: 0

Views: 27

Answers (0)
