Reputation: 437
I came across this piece of code:
from tensorflow.keras.datasets import mnist  # import needed for the snippet to run

(x_train, y_train), (x_test, y_test) = mnist.load_data()
print("Shape of x_train: " + str(x_train.shape))
print("Shape of y_train: " + str(y_train.shape))
I found that the output looks like this:
Shape of x_train: (60000, 28, 28)
Shape of y_train: (60000,)
For the first line of output, my understanding so far is this: does it mean that the 1st dimension can hold 60k items, the next dimension can hold 28 "arrays of 60k items", and the last dimension can hold 28 "arrays of 28 "arrays of 60k items""?
What I want to clarify is: is this 60k samples of 28x28 data, or something else?
For the second line of output, it seems like it's just a 1d array of 60k items. So what does it actually represent? (I know that x_train holds handwritten digits and that each value represents the intensity of grey in that cell.)
Please note that I took this code from an online example (I don't remember which one, and I won't mind if you want your credit added), and it uses the public dataset tf.keras.datasets.mnist.
Upvotes: 2
Views: 232
Reputation: 186
You are right: the first line means 60k items of 28x28 data, hence the shape (60000, 28, 28).
y_train holds the labels for x_train, so it is one-dimensional with 60k entries.
For example, if the first item of x_train is a handwritten image of a 3, then the first item of y_train will be 3, which is its label.
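The image/label pairing described above can be sketched without downloading anything. This is only an illustration under stated assumptions: x_demo and y_demo are hypothetical stand-ins with the same layout as x_train and y_train but just 5 samples, and the pixel values and labels are made up rather than taken from MNIST.

```python
import numpy as np

# Stand-in arrays laid out like MNIST (assumption: 5 samples instead of 60000,
# random pixel values and made-up labels, not the real dataset).
rng = np.random.default_rng(0)
x_demo = rng.integers(0, 256, size=(5, 28, 28), dtype=np.uint8)
y_demo = np.array([3, 0, 4, 1, 9], dtype=np.uint8)

first_image = x_demo[0]   # indexing the first axis picks one whole image
first_label = y_demo[0]   # the label at the same index belongs to that image

print(x_demo.shape)       # (5, 28, 28)
print(first_image.shape)  # (28, 28)
print(y_demo.shape)       # (5,)
print(first_label)        # 3
```

With the real dataset the same indexing should apply: x_train[i] would be one 28x28 image and y_train[i] the digit it shows.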
Upvotes: 1
Reputation: 6864
To understand this, let's start with a 1d array of shape (8,).
[1, 2, 3, 4, 5, 6, 7, 8]
If this is represented as a 2d array, say of shape (4, 2), it becomes
[
  [1, 2],
  [3, 4],
  [5, 6],
  [7, 8]
]
Every item in the 2d array has shape (2,), and there are 4 items in total.
Now let's represent it in 3d with shape (2, 2, 2).
[
  [
    [1, 2],
    [3, 4]
  ],
  [
    [5, 6],
    [7, 8]
  ]
]
The array at the top level has 2 items, which is the 0th dimension. Each of those items again has 2 items at the second level; in the first one they are [1, 2] and [3, 4]. The final dimension of size 2 holds the individual values, such as 1 and 2, which form the last level of items in the array hierarchy.
Hence a tensor of shape (x, y, z) contains x*y*z elements.
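The same walk-through can be reproduced with NumPy. This is a minimal sketch; the variable names (flat, two_d, three_d) are chosen here for illustration and do not come from the question.

```python
import math

import numpy as np

flat = np.arange(1, 9)            # the 1d array [1 2 3 4 5 6 7 8], shape (8,)
two_d = flat.reshape(4, 2)        # 4 items, each of shape (2,)
three_d = flat.reshape(2, 2, 2)   # 2 items, each holding 2 items of shape (2,)

print(two_d[0].shape)      # (2,)
print(three_d[0].shape)    # (2, 2)
print(three_d[0][0][1])    # 2
print(three_d.size)        # 8

# The element count is the product of the shape, so for the shape in the question:
print(math.prod((60000, 28, 28)))  # 47040000
```

Reading the shape from left to right therefore goes from the outermost level of nesting to the innermost, which is why (60000, 28, 28) reads as 60000 items of 28x28 each.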
Upvotes: 1
Reputation: 4892
Your understanding of the shapes is correct. From the context, x_train is probably 60k images of handwritten digits (at a resolution of 28x28 pixels), and y_train is simply the 60k true digits that the images show.
Upvotes: 1