Reputation: 31
For the same input and label:
pytorch.nn.CTCLoss
is 5.74, tf.nn.ctc_loss
is 129.69, math.log(tf ctc loss)
is 4.86So what's the difference between pytorch.nn.CTCLoss
with tf.nn.ctc_loss
?
tf: 1.13.1
pytorch: 1.1.0
I had try to these:
log_softmax
the input, and then send it to pytorch.nn.CTCLoss
,
tf.nn.log_softmax
the input, and then send it to tf.nn.ctc_loss
directly send the input to tf.nn.ctc_loss
directly send the input to tf.nn.ctc_loss
, and then math.log(output of tf.nn.ctc_loss)
In the case 2, case 3, and case 4, the result of calculation is difference from pytorch.nn.CTCLoss
from torch import nn
import torch
import tensorflow as tf
import math
time_step = 50 # Input sequence length
vocab_size = 20 # Number of classes
batch_size = 16 # Batch size
target_sequence_length = 30 # Target sequence length
def dense_to_sparse(dense_tensor, sequence_length):
indices = tf.where(tf.sequence_mask(sequence_length))
values = tf.gather_nd(dense_tensor, indices)
shape = tf.shape(dense_tensor, out_type=tf.int64)
return tf.SparseTensor(indices, values, shape)
def compute_loss(x, y, x_len):
ctclosses = tf.nn.ctc_loss(
y,
tf.cast(x, dtype=tf.float32),
x_len,
preprocess_collapse_repeated=False,
ctc_merge_repeated=False,
ignore_longer_outputs_than_inputs=False
)
ctclosses = tf.reduce_mean(ctclosses)
with tf.Session() as sess:
ctclosses = sess.run(ctclosses)
print(f"tf ctc loss: {ctclosses}")
print(f"tf log(ctc loss): {math.log(ctclosses)}")
minimum_target_length = 10
ctc_loss = nn.CTCLoss(blank=vocab_size - 1)
x = torch.randn(time_step, batch_size, vocab_size) # [size] = T,N,C
y = torch.randint(0, vocab_size - 2, (batch_size, target_sequence_length), dtype=torch.long) # low, high, [size]
x_lengths = torch.full((batch_size,), time_step, dtype=torch.long) # Length of inputs
y_lengths = torch.randint(minimum_target_length, target_sequence_length, (batch_size,),
dtype=torch.long) # Length of targets can be variable (even if target sequences are constant length)
loss = ctc_loss(x.log_softmax(2).detach(), y, x_lengths, y_lengths)
print(f"torch ctc loss: {loss}")
x = x.numpy()
y = y.numpy()
x_lengths = x_lengths.numpy()
y_lengths = y_lengths.numpy()
x = tf.cast(x, dtype=tf.float32)
y = tf.cast(dense_to_sparse(y, y_lengths), dtype=tf.int32)
compute_loss(x, y, x_lengths)
I expect the output of tf.nn.ctc_loss
is same with the output of pytorch.nn.CTCLoss
, but actually they are not, but how can i make them same?
Upvotes: 3
Views: 1879
Reputation: 345
The automatic mean reduction of the CTCLoss of pytorch is not the same as computing all the individual losses, and then doing the mean (as you are doing in the Tensorflow implementation). Indeed from the doc of CTCLoss (pytorch):
``'mean'``: the output losses will be divided by the target lengths and
then the mean over the batch is taken.
To obtain the same value:
1- Change the reduction method to sum:
ctc_loss = nn.CTCLoss(reduction='sum')
2- Divide the loss computed by the batch_size:
loss = ctc_loss(x.log_softmax(2).detach(), y, x_lengths, y_lengths)
loss = (loss.item())/batch_size
3- Change the parameter ctc_merge_repeated of Tensorflow to True (I am assuming it is the case in the pytorch CTC as well)
ctclosses = tf.nn.ctc_loss(
y,
tf.cast(x, dtype=tf.float32),
x_len,
preprocess_collapse_repeated=False,
ctc_merge_repeated=True,
ignore_longer_outputs_than_inputs=False
)
You will now get very close results between the pytorch loss and the tensorflow loss (without taking the log of the value). The small difference remaining probably comes from slight differences in between the implementations. In my last three runs, I got the following values:
pytorch loss : 113.33 vs tf loss = 113.52
pytorch loss : 116.30 vs tf loss = 115.57
pytorch loss : 115.67 vs tf loss = 114.54
Upvotes: 5