Reputation: 89
I am reading a book about deep learning and am currently learning about the Keras functional API. In this context, the book says:
"The input layer takes a shape argument that is a tuple that indicates the dimensionality of the input data. When input data is one-dimensional, such as for a Multilayer Perceptron, the shape must explicitly leave room for the shape of the minibatch size used when splitting the data when training the network. Therefore, the shape tuple is always defined with a hanging last dimension (2,), this is the way you must define a one-dimensional tuple in Python, for example:"
I did not quite understand the shape part: why is the second element of the tuple left empty, and what does leaving it empty mean? I know that None means a dimension can take any size, but what is happening here? Also, about the mini-batch size: isn't only one sample processed at a time in a neural network? With mini-batches, we update the weights (if using SGD) after every batch of data has been evaluated by the model. Then why do we need to change the dimension of our input shape to accommodate this? Shouldn't only one data instance go through at a time?
Upvotes: 2
Views: 2087
Reputation: 6156
If your data were two-dimensional (e.g. a greyscale image), the numpy array would have shape (height, width), for example. With a one-dimensional input, though, you might be tempted to say its shape is just length. When you write (length,) instead, the difference is that you have not an integer but a tuple with one element.
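To make that distinction concrete, here is a small sketch (plain Python, no Keras required; the names length and shape are just for illustration) showing that (length,) and length are different types:

```python
# An integer and a one-element tuple are different things in Python.
length = 8
shape = (length,)      # note the trailing comma - this is what makes it a tuple

print(type(length))    # <class 'int'>
print(type(shape))     # <class 'tuple'>
print(len(shape))      # 1: a tuple with a single dimension entry
```

Without the trailing comma, (length) is just a parenthesised integer, not a tuple.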
The idea behind batches is that multiple instances are processed at once to speed up training. I am not sure exactly how that works internally, but you often have more than one instance in a batch. I believe gradient descent simply does not update the weights between individual instances; instead the weights are updated only once per batch, which means every instance in a batch can be computed in parallel.
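Here is a minimal numpy sketch of that idea; the toy model, data, and learning rate are made up for illustration. The whole batch is evaluated in one vectorised operation, the gradient is averaged over the batch, and the weight is updated once:

```python
import numpy as np

# Toy linear model y = w * x with squared-error loss, hypothetical data.
rng = np.random.default_rng(0)
x = rng.normal(size=(32, 1))        # one batch of 32 one-dimensional inputs
y = 3.0 * x                         # targets: the true weight is 3.0

w = 0.0
lr = 0.1

pred = w * x                        # shape (32, 1): whole batch at once
grad = np.mean(2 * (pred - y) * x)  # gradient averaged over the batch
w -= lr * grad                      # a single weight update per batch
print(w)
```

One forward pass over the batch plus one update, instead of 32 separate updates.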
My guess as to why they point out that the shape is a tuple is that there is no special-case handling for when the shape is just an integer: for example, you can loop over a tuple's entries, but not over an integer.
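You can see that difference directly in plain Python:

```python
shape = (3,)

# Iterating over the tuple's entries works...
dims = [d for d in shape]
print(dims)            # [3]

# ...but a bare integer is not iterable.
try:
    for d in 3:
        pass
except TypeError as exc:
    print("not iterable:", exc)
```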
Notice also that the shape of a numpy array is also a tuple:
>>> import numpy as np
>>> np.array([1,2,3]).shape
(3,)
so you can use array.shape directly if you wish to do so.
Technically, you could use batches but set the batch size to 1. That can be confusing though, because if you use squeeze somewhere, it will get rid of the batch dimension as well.
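A quick numpy illustration of that pitfall: with a batch size of 1, squeeze removes the batch axis along with any other size-1 axis, unless you name the axis explicitly:

```python
import numpy as np

batch = np.zeros((1, 5, 1))  # batch size 1, 5 timesteps, 1 feature

# squeeze with no arguments drops ALL size-1 axes, including the batch axis.
print(np.squeeze(batch).shape)           # (5,)

# Passing an explicit axis removes only the feature axis and keeps the batch.
print(np.squeeze(batch, axis=-1).shape)  # (1, 5)
```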
Upvotes: 2