zrebot

Reputation: 51

What is the meaning of neural network iterations, gradient descent steps, epoch, and batch size?

Could you explain the terms below? They really confuse me: 1. iterations, 2. gradient descent steps, 3. epoch, 4. batch size.

Upvotes: 3

Views: 1213

Answers (2)

Oleg Melnikov

Reputation: 3298

In addition to Sayali's great answer, here are the definitions from the Keras Python package:

  1. Sample: one element of a dataset. Example: one image is a sample in a convolutional network. Example: one audio file is a sample for a speech recognition model.
  2. Batch: a set of N samples. The samples in a batch are processed independently, in parallel. If training, a batch results in only one update to the model. A batch generally approximates the distribution of the input data better than a single input. The larger the batch, the better the approximation; however, it is also true that the batch will take longer to process and will still result in only one update. For inference (evaluate/predict), it is recommended to pick a batch size that is as large as you can afford without going out of memory (since larger batches will usually result in faster evaluating/prediction).
  3. Epoch: an arbitrary cutoff, generally defined as "one pass over the entire dataset", used to separate training into distinct phases, which is useful for logging and periodic evaluation.
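
To see where these knobs live in the Keras API, here is a minimal sketch (my addition, using a hypothetical two-layer model on random data, not taken from the Keras docs): with 1000 samples and batch_size=100, each epoch performs 10 parameter updates.

    # Hedged sketch: hypothetical model and random data, for illustration only.
    import numpy as np
    from tensorflow import keras

    x_train = np.random.rand(1000, 20)               # 1000 samples, 20 features each
    y_train = np.random.randint(0, 2, size=(1000,))  # binary labels

    model = keras.Sequential([
        keras.layers.Dense(16, activation="relu", input_shape=(20,)),
        keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="sgd", loss="binary_crossentropy")

    # batch_size = samples per gradient update; epochs = passes over the dataset.
    # 1000 samples / batch_size 100 -> 10 updates (batches) per epoch.
    model.fit(x_train, y_train, batch_size=100, epochs=5)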

Upvotes: 0

Sayali Sonawane

Reputation: 12599

In neural network terminology:

  • one epoch = one forward pass and one backward pass of all the training examples
  • batch size = the number of training examples in one forward/backward pass. The higher the batch size, the more memory space you'll need.
  • number of iterations = number of passes, each pass using [batch size] number of examples. To be clear, one pass = one forward pass + one backward pass (we do not count the forward pass and backward pass as two different passes).

Example: if you have 1000 training examples, and your batch size is 500, then it will take 2 iterations to complete 1 epoch.
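
To make the counting concrete, here is a small sketch (my addition, plain Python with no real model) of a training loop that tracks epochs and iterations:

    import math

    n_examples = 1000   # training examples, as in the example above
    batch_size = 500
    n_epochs = 3

    iters_per_epoch = math.ceil(n_examples / batch_size)  # 1000 / 500 = 2

    iteration = 0
    for epoch in range(n_epochs):
        for start in range(0, n_examples, batch_size):
            # one iteration = one forward pass + one backward pass on
            # examples[start : start + batch_size], then one weight update
            iteration += 1
        print(f"after epoch {epoch + 1}: {iteration} iterations")
    # prints 2, 4, 6 -> exactly 2 iterations per epoch, as stated above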

Gradient Descent:

Please watch this lecture: https://www.coursera.org/learn/machine-learning/lecture/8SpIM/gradient-descent (Source: Andrew Ng, Coursera)

So let's see what gradient descent does. Imagine this is like the landscape of some grassy park, with two hills, and I want you to imagine that you are physically standing at that point on the hill, on this little red hill in your park.

It turns out that if you're standing at that point on the hill, you look all around, and you find that the best direction to take a little step downhill is roughly that direction.

Okay, and now you're at this new point on your hill. You're going to, again, look all around and ask: what direction should I step in to take a little baby step downhill? And if you do that, you take another step in that direction.


And then you keep going. From this new point you look around, decide what direction would take you downhill most quickly, take another step, another step, and so on until you converge to a local minimum.


In gradient descent, what we're going to do is spin 360 degrees around, just look all around us, and ask: if I were to take a little baby step in some direction, and I want to go downhill as quickly as possible, what direction do I take that little baby step in? In other words, I want to physically walk down this hill as rapidly as possible.
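
That "baby step downhill" is the update theta := theta - alpha * dJ/d(theta). Here is a tiny numeric sketch (my illustration, not from the lecture) on the one-variable hill J(theta) = (theta - 3)^2:

    # Gradient descent on J(theta) = (theta - 3)^2, whose minimum is at theta = 3.
    def grad(theta):
        return 2 * (theta - 3)   # derivative of (theta - 3)^2

    theta = 0.0   # starting point on the "hill"
    alpha = 0.1   # learning rate: the size of each baby step

    for step in range(25):
        theta -= alpha * grad(theta)   # step in the steepest-descent direction

    print(theta)   # ~2.99: after 25 steps we have converged close to the minimum

Each pass through the loop is one gradient descent step; with batches, one such step happens per iteration, as described above.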


I hope you now understand the significance of gradient descent steps. Hope this is helpful!

Upvotes: 7
