Reputation: 2285
Network:
Input sequence -> BiLSTM -> BiLSTM -> Dense with softmax
Output shapes:    (None, 5, 256) -> (None, 5, 128) -> (None, 5, 11)
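For context, the model can be sketched roughly like this (the LSTM unit counts and the per-timestep feature size are assumptions, chosen only to reproduce the shapes above):

import tensorflow as tf
from tensorflow.keras import layers

num_timesteps = 5     # input sequence length
num_features = 32     # placeholder feature size per timestep (assumption)
num_classes = 10 + 1  # 10 classes + 1 CTC blank

model = tf.keras.Sequential([
    tf.keras.Input(shape=(num_timesteps, num_features)),
    layers.Bidirectional(layers.LSTM(128, return_sequences=True)),  # (None, 5, 256)
    layers.Bidirectional(layers.LSTM(64, return_sequences=True)),   # (None, 5, 128)
    layers.Dense(num_classes, activation="softmax"),                # (None, 5, 11)
])
model.summary()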
Here is my CTC loss:
import tensorflow as tf

def calculate_ctc_loss(y_true, y_pred):
    batch_length = tf.cast(tf.shape(y_true)[0], dtype="int64")
    input_length = tf.cast(tf.shape(y_pred)[1], dtype="int64")
    label_length = tf.cast(tf.shape(y_true)[1], dtype="int64")

    # Broadcast the scalar lengths to shape (batch, 1), as ctc_batch_cost expects
    input_length = input_length * tf.ones(shape=(batch_length, 1), dtype="int64")
    label_length = label_length * tf.ones(shape=(batch_length, 1), dtype="int64")

    loss = tf.keras.backend.ctc_batch_cost(y_true, y_pred, input_length, label_length)
    return loss
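For reference, a quick sanity check of the loss with dummy tensors looks like this (the label length of 3 here is an assumption, chosen so the labels always fit into the 5 output time steps):

import numpy as np
import tensorflow as tf

dummy_y_true = np.random.randint(0, 10, size=(16, 3)).astype("int64")  # (16, 3) integer labels
dummy_y_pred = tf.nn.softmax(tf.random.normal((16, 5, 11)), axis=-1)   # (16, 5, 11) softmax outputs

print(calculate_ctc_loss(dummy_y_true, dummy_y_pred))  # per-sample losses, shape (16, 1)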
There are 10 classes in total. For the first batch, with a batch size of 16, the shapes are:
y_true: (16, 7)
y_pred: (16, 5, 11)
I tried to pad the time dimension of y_pred so that its shape becomes (16, 7, 11), but the loss turned into NaN.
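For illustration, the padding was along these lines (a sketch of zero-padding the time axis with tf.pad, not the exact training code):

import tensorflow as tf

y_pred = tf.random.uniform((16, 5, 11))                   # stand-in for the model output
y_pred_padded = tf.pad(y_pred, [[0, 0], [0, 2], [0, 0]])  # zero-pad time axis: (16, 5, 11) -> (16, 7, 11)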
Question: How do I correctly pad the time dimension in this case so that y_true and y_pred have compatible shapes for the CTC calculation?
Upvotes: 0
Views: 106