Training with tf.data API and sample weights

Question

All my training images are in tfrecords files. Now they are used in a standard way like this:

dataset = dataset.apply(tf.data.experimental.map_and_batch(
            map_func=lambda x: preprocess(x, data_augmentation_options=data_augmentation), 
            batch_size=images_per_batch)

where preprocess returns the decoded image and the label which both come from the tfrecord file.

Now the new situation. I want also a sample weight for each example. So instead of

return image,label

in preprocess, it should be

return image, label, sample_weight

However, this sample_weight is not in the tfrecord file. It is computed when training start based on number of examples for each class. Basically it is a Python dictionary weights[label] = sample_weights.

The question is how to use these sample weights in the tf.data pipeline. Because label is a Tensor it cannot be used to index the Python dictionary.

Training with tf.data API and sample weights

Answers (1)

Related Questions