Sneha Sridharan

Reputation: 165

Explanation of what tensorflow.keras.datasets.mnist.load_data() returns

I came across the statement:

(x_train, y_train), (x_test, y_test) = mnist.load_data()

and its corresponding explanation for what it returns:

Returns: 2 tuples: x_train, x_test: uint8 array of grayscale image data with shape (num_samples, 28, 28). y_train, y_test: uint8 array of digit labels (integers in range 0-9) with shape (num_samples,).

What I am unsure about is whether each of x_train, x_test, y_train and y_test is itself a tuple holding the values (num_samples, 28, 28) or (num_samples,), respectively. And is the pair x_train, x_test then actually a tuple of tuples?

I am new to this topic, so I am sorry if I am asking very silly questions! If anyone out there has an explanation for this, please write back.

Upvotes: 3

Views: 822

Answers (2)

ruohola

Reputation: 24087

From the docs:

tf.keras.datasets.mnist.load_data

Returns:
Tuple of Numpy arrays: (x_train, y_train), (x_test, y_test).

This wording is somewhat confusing, though: the function actually returns a tuple of two tuples, each containing two NumPy arrays:

Tuple[Tuple[np.ndarray, np.ndarray], Tuple[np.ndarray, np.ndarray]]
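To make the nesting concrete, here is a minimal sketch that avoids downloading the dataset. The function fake_load_data is a hypothetical stand-in (it is not part of Keras) that returns zero-filled arrays arranged in the same nested structure as the annotation above. It also shows the likely source of the confusion: the arrays are not tuples, but each array's .shape attribute is.

```python
import numpy as np

def fake_load_data(n_train=6, n_test=2):
    """Hypothetical stand-in that mimics the return structure of mnist.load_data()."""
    x_train = np.zeros((n_train, 28, 28), dtype=np.uint8)  # images
    y_train = np.zeros((n_train,), dtype=np.uint8)         # labels
    x_test = np.zeros((n_test, 28, 28), dtype=np.uint8)
    y_test = np.zeros((n_test,), dtype=np.uint8)
    return (x_train, y_train), (x_test, y_test)

result = fake_load_data()
train_pair, test_pair = result   # outer tuple holds two inner tuples
x_train, y_train = train_pair    # each inner tuple holds two arrays

print(type(result).__name__)         # tuple
print(type(train_pair).__name__)     # tuple
print(type(x_train).__name__)        # ndarray
print(type(x_train.shape).__name__)  # tuple
print(x_train.shape)                 # (6, 28, 28)
```

So (num_samples, 28, 28) describes the shape of the array x_train; it is not the content of x_train itself.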

Upvotes: 1

Thomas Schillaci

Reputation: 2453

Let's look at the shapes of those objects:

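import numpy as np
from tensorflow.keras.datasets import mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()

print(np.shape(x_train))  # training images
print(np.shape(x_test))   # test images
print(np.shape(y_train))  # training labels
print(np.shape(y_test))   # test labels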

(60000, 28, 28)

(10000, 28, 28)

(60000,)

(10000,)

Your x_* arrays contain 60000 and 10000 matrices, respectively, each of 28*28 pixels encoded as integers between 0 and 255.

Your y_* arrays contain the labels, i.e. which digit is shown in the corresponding 28*28 pixel matrix.
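As a rough illustration of how the images and labels line up, the sketch below pairs one image with its label by index. The arrays x and y are synthetic random stand-ins with the same dtype and layout as described above (they are not real MNIST data), so the printed pixel and label values carry no meaning.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins (assumption: same dtype and layout as the MNIST arrays)
x = rng.integers(0, 256, size=(5, 28, 28), dtype=np.uint8)  # 5 fake images
y = rng.integers(0, 10, size=(5,), dtype=np.uint8)          # 5 fake labels

image = x[0]  # one 28*28 matrix of pixel intensities
label = y[0]  # the label stored at the same index

print(image.shape)  # (28, 28)
print(image.dtype)  # uint8
print(int(image.min()), int(image.max()))
print(int(label))
```

With the real dataset, x_train[i] and y_train[i] are matched in the same way: the label at index i describes the image at index i.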

Upvotes: 3
