ctiid

Reputation: 365

How to get the number of samples in a tf.data.Dataset for steps_per_epoch?

I am curious how I can set steps_per_epoch in tf.keras fit when training on a tf.data.Dataset. Since I need the number of examples to calculate it, how do I get that number?

Since it is a tf.data object, you would assume this is easy. If I set steps_per_epoch to None, I get "unknown".

Why does using tf.data make life so complicated?

Upvotes: 2

Views: 1993

Answers (2)

Timbus Calin

Reputation: 14983

The previous answer is good, yet I would like to point out two matters:

  1. The code below works; there is no need to use the experimental package anymore.
import tensorflow as tf
dataset = tf.data.Dataset.range(42)
# Still prints 42
print(dataset.cardinality().numpy())
  2. If you use a filter predicate, cardinality() may return -2, which means the cardinality is unknown. If you do filter your dataset, make sure you have calculated its length in another way (for example, the length of the pandas DataFrame before applying .from_tensor_slices() to it).
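The second point can be checked with a small sketch. The even-number filter and the variable names here are made up for illustration, and counting with reduce() is just one possible fallback (it iterates over the whole dataset once, so it can be slow on large inputs):

```python
import tensorflow as tf

dataset = tf.data.Dataset.range(42)

# Hypothetical predicate: keep only even numbers.
filtered = dataset.filter(lambda x: x % 2 == 0)

# After filter(), TensorFlow can no longer infer the size statically.
card = int(filtered.cardinality().numpy())
is_unknown = card == tf.data.UNKNOWN_CARDINALITY  # UNKNOWN_CARDINALITY is -2

# Fallback: count the elements by iterating over the dataset once.
count = int(
    filtered.reduce(tf.constant(0, dtype=tf.int64), lambda acc, _: acc + 1).numpy()
)

print(card, is_unknown, count)
```

If the length is already known from the source data (such as the DataFrame mentioned above), using that number avoids the extra pass.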

Another important point is how to set the parameters steps_per_epoch and validation_steps: steps_per_epoch == length_of_training_dataset // batch_size, validation_steps == length_of_validation_dataset // batch_size
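As a rough sketch of that arithmetic in plain Python: the helper name steps_for and the example sizes are hypothetical, and the drop_remainder switch is an addition for the case where the last partial batch should also count as a step:

```python
import math


def steps_for(num_examples, batch_size, drop_remainder=True):
    """Number of steps needed to cover num_examples once (hypothetical helper)."""
    if drop_remainder:
        # Floor division: the final partial batch is ignored.
        return num_examples // batch_size
    # Ceiling division: the final partial batch counts as one more step.
    return math.ceil(num_examples / batch_size)


# Made-up sizes, for illustration only.
steps_per_epoch = steps_for(1000, 32)
validation_steps = steps_for(200, 32)
print(steps_per_epoch, validation_steps)
```

Floor division matches the formula above. Whether the ceiling variant is appropriate depends on whether the dataset actually yields that last partial batch.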

A full example is available here: How to use repeat() function when building data in Keras?

Upvotes: 4

Vaziri-Mahmoud

Reputation: 150

Try tf.data.experimental.cardinality:

import tensorflow as tf
dataset = tf.data.Dataset.range(42)
print(tf.data.experimental.cardinality(dataset).numpy())

42
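Since the goal is steps_per_epoch, it may be simpler to batch first and then read the cardinality, which then counts batches rather than examples. A minimal sketch, assuming a batch size of 8 (an arbitrary choice) and a dataset whose size TensorFlow can infer:

```python
import tensorflow as tf

dataset = tf.data.Dataset.range(42)

# Batch first: the cardinality now counts batches, not examples.
batched = dataset.batch(8)
steps_per_epoch = int(batched.cardinality().numpy())

# With drop_remainder=True the final partial batch is discarded.
full_batches = int(dataset.batch(8, drop_remainder=True).cardinality().numpy())

print(steps_per_epoch, full_batches)
```

This only helps when the cardinality is known. For an unknown or infinite dataset the call returns a negative sentinel value instead of a usable count.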

Upvotes: 0
