Ken Krige

Reputation: 106

Mixing datasets in set ratio

With TensorFlow's tf.data.Dataset, how do I mix two datasets so that 75% of the examples come from my original data and 25% from the augmented data?

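import tensorflow as tf
# Original examples: read every TFRecord file matched by the pattern
d = tf.data.Dataset.list_files("raw_data/")\
    .flat_map(tf.data.TFRecordDataset)
# Augmented examples: read the same way from the augmented files
ad = tf.data.Dataset.list_files("augmented_data/")\
    .flat_map(tf.data.TFRecordDataset)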

Upvotes: 0

Views: 101

Answers (1)

Sharky

Reputation: 4533

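The difficulty is that len() generally does not work on a dataset whose size is unknown, as is the case when reading TFRecord files, so you often cannot know the exact number of examples until you have iterated over a full epoch. You can still approximate a fixed split with the take and skip methods.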

train_dataset = dataset.take(number_examples_for_train)
test_dataset = dataset.skip(number_examples_for_train)

These two methods complement each other: take(n) keeps the first n elements, while skip(n) drops them and keeps the rest. https://www.tensorflow.org/api_docs/python/tf/data/Dataset#take
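Applied to the 75/25 question, the same idea could look roughly like the sketch below. It is only an illustration under stated assumptions: the two Dataset.range datasets are toy stand-ins for the TFRecord datasets d and ad, count_examples is a hypothetical helper, and counting requires one full pass over the data.

```python
import tensorflow as tf

# Toy stand-ins for the two TFRecord datasets (hypothetical data):
# "original" examples are 0..299, "augmented" examples are 1000..1199.
d = tf.data.Dataset.range(300)
ad = tf.data.Dataset.range(1000, 1200)


def count_examples(dataset):
    """Hypothetical helper: count elements by iterating over the dataset once."""
    return int(dataset.reduce(tf.constant(0, tf.int64), lambda n, _: n + 1))


n_original = count_examples(d)
# 75% original to 25% augmented means one augmented example per three originals.
n_augmented = min(n_original // 3, count_examples(ad))

# Keep all originals, take only as many augmented examples as the ratio allows,
# then shuffle so the two sources are interleaved.
mixed = d.concatenate(ad.take(n_augmented)).shuffle(
    buffer_size=n_original + n_augmented, seed=0
)

values = [int(x) for x in mixed.as_numpy_iterator()]
n_from_original = sum(v < 1000 for v in values)
n_from_augmented = len(values) - n_from_original
print(n_from_original, n_from_augmented)
```

If an approximate ratio is enough and you would rather not count examples first, recent TensorFlow releases also offer tf.data.Dataset.sample_from_datasets, which draws from several datasets according to given weights such as [0.75, 0.25].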

Upvotes: 1
