Reputation: 2067
Is tf.train.GradientDescentOptimizer
vanilla gradient descent? I.e., not SGD — so is it equivalent to a gradient update implemented in NumPy?
Upvotes: 0
Views: 430
Reputation: 4201
Yes, it applies the vanilla gradient descent update. But you can't say it is "not SGD": whether training counts as SGD is not a property of the optimizer but of how many examples you feed per update step. If each step uses only a mini-batch of the data (a single example in the strict definition, but a mini-batch in common usage), we call it SGD; if each step uses the full dataset, it's batch gradient descent.

And yes, each update is functionally equivalent to a NumPy implementation of `w -= learning_rate * grad`.
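As a minimal sketch of that equivalence (an assumed toy example, not code from TensorFlow): plain gradient descent in NumPy on f(w) = (w − 3)², using the same update rule `w <- w - learning_rate * grad` that `tf.train.GradientDescentOptimizer` applies to each variable.

```python
import numpy as np

# Toy objective: f(w) = (w - 3)^2, whose gradient is df/dw = 2 * (w - 3).
w = np.array(0.0)
learning_rate = 0.1

for _ in range(100):
    grad = 2.0 * (w - 3.0)        # gradient of the loss at the current w
    w = w - learning_rate * grad  # the same update GradientDescentOptimizer performs

print(w)  # converges to ~3.0, the minimizer of f
```

Feeding this loop the gradient of a full-dataset loss gives batch gradient descent; computing `grad` from a random mini-batch at each step gives SGD. The update rule itself is unchanged, which is exactly the point above.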
Upvotes: 1