Jacob

Reputation: 161

How to add two tensors with different shapes in Python or TensorFlow

I have two tensors of different shapes, produced by two models. When I print their shapes, I get:

caption loss is (2, 128)
image loss is (128, 128)

One tensor has shape (2, 128) and the other has shape (128, 128). The code that produces them is below:

captions_loss = keras.losses.kl_divergence(
        y_true=targets, y_pred=logits, #from_logits=True
    )

images_loss = keras.losses.kl_divergence(
        y_true=tf.transpose(targets), y_pred=tf.transpose(logits), #from_logits=True
    )

When I add the two as shown below, it throws an error:

return (captions_loss + images_loss) / 2

Is there any way to add these two?

captions_loss = (2, 128)
images_loss = (128, 128)

Upvotes: 1

Views: 657

Answers (2)

AloneTogether

Reputation: 26718

Tensors can generally be broadcast against each other once their shapes are compatible. You can try a few options and see how they affect model performance:

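import tensorflow as tf

# Random stand-ins for the two losses
captions_loss = tf.random.normal((2, 128))
images_loss = tf.random.normal((128, 128))

# Option 1: sum captions_loss over axis 0 -> shape (128,), then broadcast
(tf.reduce_sum(captions_loss, axis=0) + images_loss) / 2

# Option 2: average captions_loss over axis 0 -> shape (128,), then broadcast
(tf.reduce_mean(captions_loss, axis=0) + images_loss) / 2

# Option 3: add each row of captions_loss separately
(captions_loss[0, :] + images_loss + captions_loss[1, :]) / 2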
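
A quick shape check can show why the direct sum fails while the three options work. The following is a minimal sketch, not a definitive implementation: it uses NumPy in place of TensorFlow (an assumption, made only so it runs without TensorFlow installed; TensorFlow's elementwise ops follow the same broadcasting rules), and the random arrays are stand-ins for the real losses.

```python
import numpy as np

# Stand-ins for the two loss tensors from the question (random values, assumed shapes).
rng = np.random.default_rng(0)
captions_loss = rng.normal(size=(2, 128))
images_loss = rng.normal(size=(128, 128))

# Direct addition fails: the leading dimensions 2 and 128 are incompatible.
try:
    captions_loss + images_loss
    direct_sum_ok = True
except ValueError:
    direct_sum_ok = False

# Reducing or slicing along axis 0 gives shape (128,), which broadcasts against (128, 128).
option_1 = (captions_loss.sum(axis=0) + images_loss) / 2
option_2 = (captions_loss.mean(axis=0) + images_loss) / 2
option_3 = (captions_loss[0, :] + images_loss + captions_loss[1, :]) / 2

print(direct_sum_ok)                                   # False
print(option_1.shape, option_2.shape, option_3.shape)  # (128, 128) (128, 128) (128, 128)
print(np.allclose(option_1, option_3))                 # True
```

Options 1 and 3 give the same values up to floating-point rounding, since summing two rows is the same as adding each row in turn. Only Option 2 weights the caption term differently.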

Upvotes: 2

MangoNrFive

Reputation: 1599

If you convert your matrices to NumPy arrays, you can take advantage of NumPy's broadcasting, which works whenever the two shapes are compatible:

import numpy as np


A = np.array([
    [10, 10, 10]
])

B = np.array([
    [2, 2, 2],
    [3, 3, 3],
    [1, 1, 1],
])

print(A.shape)
print(B.shape)
print(A + B)

Output:

(1, 3)
(3, 3)
[[12 12 12]
 [13 13 13]
 [11 11 11]]
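
Whether two shapes are compatible follows a simple rule: compare dimensions from right to left, and each pair must be equal or contain a 1. As a rough sketch of that rule in plain Python (the helper name `broadcast_shape` is hypothetical and not part of NumPy or TensorFlow):

```python
from itertools import zip_longest

def broadcast_shape(shape_a, shape_b):
    """Hypothetical helper: return the broadcast shape, or None if incompatible."""
    result = []
    # Compare dimensions right to left; a missing dimension counts as 1.
    for a, b in zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1):
        if a == b or b == 1:
            result.append(a)
        elif a == 1:
            result.append(b)
        else:
            return None  # e.g. 2 vs 128: neither equal nor 1
    return tuple(reversed(result))

print(broadcast_shape((1, 3), (3, 3)))          # (3, 3)
print(broadcast_shape((2, 128), (128, 128)))    # None
print(broadcast_shape((128,), (128, 128)))      # (128, 128)
```

By this rule the (1, 3) and (3, 3) example broadcasts, but the (2, 128) and (128, 128) shapes from the question do not, so one of them has to be reduced or reshaped first.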

Upvotes: 1
