Reputation: 437
I have implemented a seq2seq translation model in TensorFlow 2.0, but during training I get the following error:
ValueError: Shapes (2056, 10, 10000) and (1776, 10, 10000) are incompatible
I have 10000 records in my dataset. The shapes match for the first 8224 records, but for the last 1776 records I get the error above, because my batch_size is larger than the number of remaining records. Here is my code:
import numpy as np
import tensorflow as tf

max_seq_len_output = 10
n_words = 10000
batch_size = 2056

model = Model_translation(batch_size=batch_size, embed_size=embed_size, total_words=n_words, dropout_rate=dropout_rate, num_classes=n_words, embedding_matrix=embedding_matrix)

dataset_train = tf.data.Dataset.from_tensor_slices((encoder_input, decoder_input, decoder_output))
dataset_train = dataset_train.shuffle(buffer_size=1024).batch(batch_size)

loss_object = tf.keras.losses.CategoricalCrossentropy()  # used in backprop
optimizer = tf.keras.optimizers.Adam(learning_rate=0.001)
train_loss = tf.keras.metrics.Mean(name='train_loss')  # mean of the losses per observation
train_accuracy = tf.keras.metrics.CategoricalAccuracy(name='train_accuracy')
##### no @tf.function here
def training(X_1, X_2, y):
    # Build the one-hot targets inside the training step; doing it for the whole
    # dataset up front would not fit in RAM.
    y_numpy = y.numpy()
    Y = np.zeros((batch_size, max_seq_len_output, n_words), dtype='float32')
    for i, d in enumerate(y_numpy):
        for t, word in enumerate(d):
            if word != 0:
                Y[i, t, word] = 1
    Y = tf.convert_to_tensor(Y)

    # predictions
    with tf.GradientTape() as tape:  # trainable variables (created by tf.Variable or tf.compat.v1.get_variable, where trainable=True is the default) are automatically watched
        predictions = model(X_1, X_2)
        loss = loss_object(Y, predictions)
    gradients = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(gradients, model.trainable_variables))

    train_loss(loss)
    train_accuracy(Y, predictions)

    del Y
    del y_numpy
EPOCHS = 70

for epoch in range(EPOCHS):
    for X_1, X_2, y in dataset_train:
        training(X_1, X_2, y)

    template = 'Epoch {}, Loss: {}, Accuracy: {}'
    print(template.format(epoch + 1, train_loss.result(), train_accuracy.result() * 100))

    # Reset the metrics for the next epoch
    train_loss.reset_states()
    train_accuracy.reset_states()
How can I fix this problem?
Upvotes: 1
Views: 221
Reputation: 1196
One solution would be to drop the remainder during batching:
dataset_train = dataset_train.shuffle(buffer_size = 1024).batch(batch_size, drop_remainder=True)
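This works because your training function allocates the one-hot target Y with a fixed first dimension of batch_size, so every batch fed to it must contain exactly batch_size records; drop_remainder=True simply discards the final, smaller batch. If you would rather keep those last records, a possible alternative (just a sketch, not tested against your model) is to build Y from the runtime batch size instead of the fixed batch_size, for example:

# Inside training(): replaces the numpy loop and the fixed-size np.zeros allocation.
# tf.one_hot sizes the target from the actual batch, and the mask keeps the
# padding index 0 as an all-zero row, matching the behaviour of the original loop.
Y = tf.one_hot(y, depth=n_words, dtype=tf.float32)               # (batch, max_seq_len_output, n_words)
mask = tf.cast(tf.not_equal(y, 0), tf.float32)[..., tf.newaxis]  # zero out padding positions
Y = Y * mask

With this, a partial final batch no longer causes a shape mismatch, and y.numpy() is not needed inside the training step.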
Upvotes: 1