Reputation: 293
I am not able to find anything about gradient ascent. Any good link about gradient ascent demonstrating how it is different from gradient descent would help.
Upvotes: 28
Views: 31400
Reputation: 131
Gradient Descent is used to minimize a particular function whereas gradient ascent is used to maximize a function.
Check this out http://pandamatak.com/people/anand/771/html/node33.html
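As a minimal sketch (the function and step size below are purely illustrative), the only difference between the two is the sign of the update step:

    # Minimize f(x) = (x - 2)^2 with gradient descent,
    # maximize g(x) = -(x - 2)^2 with gradient ascent; both converge to x = 2.
    def grad_f(x):
        return 2 * (x - 2)      # derivative of (x - 2)^2

    def grad_g(x):
        return -2 * (x - 2)     # derivative of -(x - 2)^2

    lr = 0.1
    x_min, x_max = 10.0, 10.0
    for _ in range(100):
        x_min -= lr * grad_f(x_min)   # descent: step against the gradient
        x_max += lr * grad_g(x_max)   # ascent: step along the gradient

    print(x_min, x_max)  # both approach 2.0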
Upvotes: 8
Reputation: 41
Gradient ascent maximizes a function so as to reach a better optimum; it is used, for example, in reinforcement learning, and it follows the upward (increasing) slope of the graph.
Gradient descent minimizes a function; it is used, for example, to minimize the cost function in linear regression, and it follows the downward (decreasing) slope of the cost function.
Upvotes: 4
Reputation: 4931
If you want to minimize a function, you use gradient descent. For example, in deep learning we want to minimize the loss function, hence we use gradient descent.
If you want to maximize a function, you use gradient ascent. For example, in reinforcement learning (policy gradient methods) the goal is to maximize the reward/expected-return function, hence we use gradient ascent.
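A rough sketch of the latter idea (the "reward" function below is a made-up concave stand-in, not a real RL objective): gradient ascent simply steps in the direction of the gradient.

    # Toy gradient ascent: maximize R(theta) = -(theta - 3)^2 + 5.
    def reward_grad(theta):
        return -2 * (theta - 3)        # dR/dtheta

    theta, lr = 0.0, 0.1
    for _ in range(200):
        theta += lr * reward_grad(theta)   # ascent: move along the gradient

    print(theta)  # approaches 3, where the toy reward is maximal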
Upvotes: 3
Reputation: 62
Gradient is another word for slope. A positive gradient at a point (x, y) means the graph slopes upwards at that point, while a negative gradient means it slopes downwards.
Gradient descent is an iterative algorithm used to find a set of theta that minimizes the value of a cost function. Gradient ascent, by the same logic, produces a set of theta that maximizes the value of an objective function.
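For instance, a minimal gradient descent sketch (with made-up data and a single-parameter squared-error cost; flipping the sign of the update would turn it into gradient ascent on the negated cost):

    import numpy as np

    # Made-up 1-D data, roughly y = 2 * x
    X = np.array([1.0, 2.0, 3.0, 4.0])
    y = np.array([2.1, 3.9, 6.2, 8.1])

    theta, lr = 0.0, 0.01
    for _ in range(1000):
        grad = (2 / len(X)) * np.sum((theta * X - y) * X)  # d(MSE)/d(theta)
        theta -= lr * grad                                  # descent step

    print(theta)  # close to 2, the theta that minimizes the cost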
Upvotes: 0
Reputation:
Typically, you'd use gradient ascent to maximize a likelihood function, and gradient descent to minimize a cost function. Both gradient descent and ascent are practically the same. Let me give you a concrete example using an algorithm that is friendly to gradient-based optimization and has a concave/convex likelihood/cost function: logistic regression.
Unfortunately, SO still doesn't seem to support LaTeX, so let me post a few screenshots.
The likelihood function that you want to maximize in logistic regression is
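In standard notation (writing z^(i) = w^T x^(i) for the net input of the i-th training sample, an assumed convention here):

    L(\mathbf{w}) = P(\mathbf{y} \mid \mathbf{x}; \mathbf{w})
                  = \prod_{i=1}^{n} \left( \phi\!\left(z^{(i)}\right) \right)^{y^{(i)}} \left( 1 - \phi\!\left(z^{(i)}\right) \right)^{1 - y^{(i)}}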
where "phi" is simply the sigmoid function
Now, you want a concave function for gradient ascent, so take the log:
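In its standard form, the resulting log-likelihood is:

    l(\mathbf{w}) = \log L(\mathbf{w}) = \sum_{i=1}^{n} \left[ y^{(i)} \log \phi\!\left(z^{(i)}\right) + \left( 1 - y^{(i)} \right) \log\!\left( 1 - \phi\!\left(z^{(i)}\right) \right) \right]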
Similarly, you can just write it as its negative to get the cost function that you can minimize via gradient descent.
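i.e. the familiar logistic cost:

    J(\mathbf{w}) = -l(\mathbf{w}) = \sum_{i=1}^{n} \left[ -y^{(i)} \log \phi\!\left(z^{(i)}\right) - \left( 1 - y^{(i)} \right) \log\!\left( 1 - \phi\!\left(z^{(i)}\right) \right) \right]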
For the log-likelihood, you'd derive and apply the gradient ascent as follows:
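Concretely (using η for the learning rate, an assumed symbol), the partial derivative and the per-weight ascent update are:

    \frac{\partial l(\mathbf{w})}{\partial w_j} = \sum_{i=1}^{n} \left( y^{(i)} - \phi\!\left(z^{(i)}\right) \right) x_j^{(i)}

    w_j := w_j + \eta \sum_{i=1}^{n} \left( y^{(i)} - \phi\!\left(z^{(i)}\right) \right) x_j^{(i)}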
Since you'd want to update all weights simultaneously, let's write it as
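in vector notation (η again being the assumed learning-rate symbol):

    \mathbf{w} := \mathbf{w} + \Delta\mathbf{w}, \qquad \Delta\mathbf{w} = \eta \, \nabla_{\mathbf{w}} \, l(\mathbf{w})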
Now, it should be quite obvious that the gradient descent update is the same as the gradient ascent one; just keep in mind that we are formulating it as "taking a step in the opposite direction of the gradient of the cost function".
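That is, with the cost J(w) = -l(w), the descent step

    \mathbf{w} := \mathbf{w} - \eta \, \nabla_{\mathbf{w}} \, J(\mathbf{w}) = \mathbf{w} + \eta \, \nabla_{\mathbf{w}} \, l(\mathbf{w})

is exactly the ascent step from above.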
Hope that answers your question!
Upvotes: 25
Reputation: 66886
It is not different. Gradient ascent is just the process of maximizing, instead of minimizing, a loss function. Everything else is entirely the same. Ascent for some loss function, you could say, is like gradient descent on the negative of that loss function.
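A quick sketch of that equivalence (f here is an arbitrary illustrative function): stepping up the gradient of f is exactly stepping down the gradient of -f.

    # Ascent on f and descent on -f produce identical iterates.
    def grad_f(x):
        return -2 * (x - 5)          # gradient of f(x) = -(x - 5)^2

    lr, x_ascent, x_descent = 0.1, 0.0, 0.0
    for _ in range(50):
        x_ascent  += lr * grad_f(x_ascent)        # gradient ascent on f
        x_descent -= lr * (-grad_f(x_descent))    # gradient descent on -f

    print(x_ascent, x_descent)  # identical values, approaching 5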
Upvotes: 30