What is the structure of the data and labels in tensorflow.examples.tutorials.mnist input_data

Question

I'm trying to learn how to feed data to conv nets properly in TensorFlow, and a majority of example code uses from tensorflow.examples.tutorials.mnist import input_data.

It's simple when you can use this to access mnist data, but not helpful when trying to establish the equivalent way to structure and introduce non-mnist data to similar models.

What is the structure of the data being imported through the mnist examples, so that I can use example cnn walkthrough code and manipulate my data to mirror the structure of the mnist data?

mrry · Accepted Answer

The format of the MNIST data obtained from that example code depends on exactly how you initialize the DataSet class. Calling DataSet.next_batch(batch_size) returns two NumPy arrays, representing batch_size images and labels respectively. They have the following formats.

If the DataSet was initialized with reshape=True (the default), the images array is a batch_size by 784 matrix, in which each row contains the pixels of one MNIST image. The default type is tf.float32, and the values are pixel intensities between 0.0 and 1.0.
If the DataSet was initialized with reshape=False, the images array is a 4-dimensional array of shape batch_size by 28 by 28 by 1. The two 28s correspond to the height and width of each image in pixels; the 1 corresponds to the number of channels in the images, which are grayscale and so have only a single channel.
If the DataSet was initialized with one_hot=False (the default), the labels array is a vector of length batch_size, in which each value is the label (an integer from 0 to 9) representing the digit in the respective image.
If the DataSet was initialized with one_hot=True, the labels array is a batch_size by 10 matrix, in which each row is all zeros, except for a 1 in the column that corresponds to the label of the respective image.
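
To mirror these layouts with non-MNIST data, something like the following NumPy sketch may help. It is not the tutorial's own code: the helper name to_mnist_like and the random stand-in images are hypothetical, and it assumes your raw images are uint8 arrays of shape (n, height, width).

```python
import numpy as np

def to_mnist_like(images_uint8, labels, reshape=True, one_hot=False, num_classes=10):
    """Hypothetical helper that mimics the array layouts described above."""
    # Scale 0..255 pixel values to float32 intensities in [0.0, 1.0].
    images = images_uint8.astype(np.float32) / 255.0
    n, h, w = images.shape
    if reshape:
        images = images.reshape(n, h * w)      # one flattened image per row
    else:
        images = images.reshape(n, h, w, 1)    # keep height, width, 1 channel
    labels = np.asarray(labels, dtype=np.int64)
    if one_hot:
        # Row i is all zeros except for a 1 in column labels[i].
        labels = np.eye(num_classes, dtype=np.float32)[labels]
    return images, labels

# Stand-in data: 5 random 28x28 grayscale images with made-up digit labels.
rng = np.random.default_rng(0)
raw = rng.integers(0, 256, size=(5, 28, 28), dtype=np.uint8)
digits = np.array([3, 0, 9, 1, 7])

flat_x, int_y = to_mnist_like(raw, digits)                              # defaults
img_x, hot_y = to_mnist_like(raw, digits, reshape=False, one_hot=True)  # conv-net style

print(flat_x.shape, int_y.shape)  # (5, 784) (5,)
print(img_x.shape, hot_y.shape)   # (5, 28, 28, 1) (5, 10)
```

For images of a different size or with more classes, the same shapes apply with 28 and 10 replaced by your own height, width, and class count.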

Note that if you are interested in convolutional networks, initializing the DataSet with reshape=False is probably what you want, since that will retain spatial information about the images that will be used by the convolutional operators.
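
If you already have flattened batches, a plain NumPy reshape can recover the 4-dimensional layout. This is a minimal sketch with synthetic values standing in for a real batch, and it assumes the pixels were flattened row by row (NumPy's default C order).

```python
import numpy as np

batch_size = 4
# Synthetic stand-in for a batch_size by 784 images array.
flat_batch = np.linspace(0.0, 1.0, batch_size * 784, dtype=np.float32)
flat_batch = flat_batch.reshape(batch_size, 784)

# -1 lets NumPy infer the batch dimension; the trailing 1 is the channel axis.
image_batch = flat_batch.reshape(-1, 28, 28, 1)

# The pixel at (row, col) of image i sits at flat index row * 28 + col.
row, col = 5, 7
same_pixel = image_batch[2, row, col, 0] == flat_batch[2, row * 28 + col]
```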
