3voC

Reputation: 697

Tensorflow: very large estimator's logs with Dataset.from_tensor_slices()

I've been studying the mnist estimator code (https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/tutorials/layers/cnn_mnist.py), and after training for 150,000 steps with this code, the logs produced by the estimator are 31 MB in size (13 MB for each weight checkpoint and 5 MB for the graph definition).

While tinkering with the code, I wrote my own train_input_fn using tf.data.Dataset.from_tensor_slices(). Here is my code:

import numpy as np
import tensorflow as tf

def my_train_input_fn():
    mnist = tf.contrib.learn.datasets.load_dataset("mnist")
    images = mnist.train.images  # Returns np.array
    labels = np.asarray(mnist.train.labels, dtype=np.int32)
    # Build the input pipeline directly from the in-memory NumPy arrays
    dataset = tf.data.Dataset.from_tensor_slices(
        ({"x": images}, labels))
    dataset = dataset.shuffle(50000).repeat().batch(100)

    return dataset

And my logs, even before a single training step, right after graph initialization, were over 1.5 GB in size! (165 MB for the ckpt meta file, and around 600 MB each for the events.out.tfevents and graph.pbtxt files.)

After a little research I found out that the function from_tensor_slices() is not appropriate for larger datasets, because it creates constants in the execution graph.

Note that the above code snippet will embed the features and labels arrays in your TensorFlow graph as tf.constant() operations. This works well for a small dataset, but wastes memory---because the contents of the array will be copied multiple times---and can run into the 2GB limit for the tf.GraphDef protocol buffer.

source: https://www.tensorflow.org/programmers_guide/datasets
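
The same guide goes on to suggest defining the dataset in terms of tf.placeholder() tensors and feeding the NumPy arrays when the iterator is initialized, so the arrays never end up inside the GraphDef. A minimal sketch of that pattern, assuming TF 1.x (not code from the question):

import numpy as np
import tensorflow as tf

mnist = tf.contrib.learn.datasets.load_dataset("mnist")
images = mnist.train.images                              # np.array, float32
labels = np.asarray(mnist.train.labels, dtype=np.int32)

# Placeholders keep the arrays out of the GraphDef; the data is fed only
# once, when the iterator is initialized.
images_ph = tf.placeholder(images.dtype, images.shape)
labels_ph = tf.placeholder(labels.dtype, labels.shape)

dataset = tf.data.Dataset.from_tensor_slices(({"x": images_ph}, labels_ph))
dataset = dataset.shuffle(50000).repeat().batch(100)

iterator = dataset.make_initializable_iterator()
next_batch = iterator.get_next()

with tf.Session() as sess:
    sess.run(iterator.initializer,
             feed_dict={images_ph: images, labels_ph: labels})
    features, batch_labels = sess.run(next_batch)

Wiring this into an estimator's input_fn takes extra work, because iterator.initializer has to be run with the feed_dict, e.g. from a SessionRunHook.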

But the MNIST dataset is only around 13 MB in size. So why is my graph definition 600 MB, rather than just those additional 13 MB embedded as constants? And why is the events file so big?

The original dataset-producing code (https://github.com/tensorflow/tensorflow/blob/r1.8/tensorflow/python/estimator/inputs/numpy_io.py) doesn't produce such large log files. I guess that is because it uses queues. But queues are now deprecated and we should use tf.data.Dataset instead, right? What is the correct way to create such a dataset from a file containing images (not from TFRecord)? Should I use tf.data.FixedLengthRecordDataset?
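
On that last point, a rough sketch of what reading the raw, uncompressed MNIST idx files with tf.data.FixedLengthRecordDataset could look like; the 16-byte/8-byte headers and the 784-byte and 1-byte records come from the idx file format, and the file-name arguments are placeholders, not paths from the post:

import tensorflow as tf

def mnist_idx_dataset(images_file, labels_file):
    # idx format: images have a 16-byte header and 28*28 uint8 bytes per
    # record, labels have an 8-byte header and 1 uint8 byte per record.
    image_records = tf.data.FixedLengthRecordDataset(
        images_file, record_bytes=28 * 28, header_bytes=16)
    label_records = tf.data.FixedLengthRecordDataset(
        labels_file, record_bytes=1, header_bytes=8)

    def decode_image(record):
        pixels = tf.decode_raw(record, tf.uint8)
        return tf.cast(tf.reshape(pixels, [28 * 28]), tf.float32) / 255.0

    def decode_label(record):
        return tf.cast(tf.decode_raw(record, tf.uint8)[0], tf.int32)

    return tf.data.Dataset.zip((image_records.map(decode_image),
                                label_records.map(decode_label)))

The result can then be shuffled, repeated and batched exactly like the dataset above, without anything being embedded in the graph.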

Upvotes: 3

Views: 533

Answers (1)

ricvo

Reputation: 93

I had a similar issue; I solved it using tf.data.Dataset.from_generator, or tf.data.Dataset.range followed by dataset.map to look up the particular value.

E.g. with a generator:

# datasets_tuple, output_types and output_shapes are defined elsewhere:
# datasets_tuple holds the in-memory arrays (e.g. images and labels),
# and output_types/output_shapes describe one yielded sample.
def generator():
    for sample in zip(*datasets_tuple):
        yield sample

dataset = tf.data.Dataset.from_generator(
    generator, output_types=output_types, output_shapes=output_shapes)
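
And a rough sketch of the range-and-map variant mentioned above, as I read it (not the answerer's exact code), using tf.py_func to look the samples up in the NumPy arrays so they stay out of the graph; the array names here are assumptions:

import numpy as np
import tensorflow as tf

mnist = tf.contrib.learn.datasets.load_dataset("mnist")
images = mnist.train.images
labels = np.asarray(mnist.train.labels, dtype=np.int32)

def lookup(index):
    # Runs as regular Python, so it reads the NumPy arrays directly
    # instead of embedding them in the graph.
    return images[index], labels[index]

dataset = tf.data.Dataset.range(len(images))
dataset = dataset.map(
    lambda i: tuple(tf.py_func(lookup, [i], [tf.float32, tf.int32])))
dataset = dataset.shuffle(50000).repeat().batch(100)

Because tf.py_func drops static shape information, a second map() that calls set_shape() on the outputs may be needed before handing the dataset to an estimator.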

Upvotes: 2
