estebarb

Reputation: 475

TensorFlow not training on all inputs

I'm trying to train a TF model on 5774 examples, but each epoch stops after 96 examples and jumps to the next epoch, ignoring most of the data. Why does TF behave this way, and how can it be fixed?

import tensorflow as tf

model.compile(
    optimizer='rmsprop',
    loss='categorical_crossentropy',
    metrics=['acc']
    )

callback = tf.keras.callbacks.EarlyStopping(monitor='acc', patience=50)
history = model.fit(
    x=[train_ids, train_masks, train_segments],
    y=train_y,
    batch_size=32,
    epochs=10000,
    verbose=1,
    callbacks=[callback]
    )

Output:

Train on 5774 samples
Epoch 1/10000
  96/5774 [..............................] - ETA: 15:33 - loss: 1.9542 - acc: 0.2917Epoch 2/10000
  96/5774 [..............................] - ETA: 3:26 - loss: 1.6615 - acc: 0.5417Epoch 3/10000
  96/5774 [..............................] - ETA: 3:27 - loss: 4.9110 - acc: 0.2917Epoch 4/10000
  96/5774 [..............................] - ETA: 3:26 - loss: 1.8811 - acc: 0.2500Epoch 5/10000
  96/5774 [..............................] - ETA: 3:27 - loss: 2.0512 - acc: 0.3229Epoch 6/10000
  96/5774 [..............................] - ETA: 3:27 - loss: 1.3690 - acc: 0.4167Epoch 7/10000
  96/5774 [..............................] - ETA: 3:28 - loss: 1.4500 - acc: 0.3854Epoch 8/10000
  96/5774 [..............................] - ETA: 3:27 - loss: 1.2867 - acc: 0.3958Epoch 9/10000
  96/5774 [..............................] - ETA: 3:27 - loss: 1.3947 - acc: 0.3333Epoch 10/10000
  96/5774 [..............................] - ETA: 3:27 - loss: 1.6012 - acc: 0.1979Epoch 11/10000
  96/5774 [..............................] - ETA: 3:27 - loss: 1.4505 - acc: 0.4271Epoch 12/10000
  96/5774 [..............................] - ETA: 3:26 - loss: 1.5062 - acc: 0.2500Epoch 13/10000
  96/5774 [..............................] - ETA: 3:27 - loss: 1.4980 - acc: 0.3333Epoch 14/10000

Upvotes: 0

Views: 52

Answers (1)

estebarb

Reputation: 475

In my case train_ids, train_masks and train_segments were each a Python list of n np.arrays with shape (96,). After forcing fit with steps_per_epoch=5774//32, it showed a proper error message: the inputs have only 96 samples, even though the log says 5774.

Converting the lists to np.array fixed it, although I still think the TensorFlow log is wrong to report 5774 samples.
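
A minimal sketch of that conversion, assuming each input starts as a Python list of equal-length 1-D arrays. The name train_ids_list and the zero-filled data are hypothetical stand-ins for the real tokenizer output; only the sizes 5774 and 96 come from the question.

```python
import numpy as np

n_samples, seq_len = 5774, 96

# Hypothetical stand-in for the real data: a Python list holding
# n_samples separate arrays, each with shape (seq_len,).
train_ids_list = [np.zeros(seq_len, dtype=np.int32) for _ in range(n_samples)]

# Stack the list into one 2-D array so the first axis is the sample axis.
# np.stack raises a ValueError if the arrays do not all share one shape.
train_ids = np.stack(train_ids_list)

# Sanity check before calling fit: rows are samples, columns are tokens.
print(train_ids.shape)
assert train_ids.shape == (n_samples, seq_len)
```

The same conversion would apply to train_masks and train_segments. Checking .shape on every input before calling fit is a cheap way to confirm that the first dimension matches the number of labels in train_y.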

Upvotes: 1
