user2714423

Reputation: 293

What is the difference between gradient descent and gradient ascent?

I am not able to find anything about gradient ascent. Any good link about gradient ascent demonstrating how it is different from gradient descent would help.

Upvotes: 28

Views: 31400

Answers (6)

vishalg

Reputation: 131

Gradient Descent is used to minimize a particular function whereas gradient ascent is used to maximize a function.

Check this out http://pandamatak.com/people/anand/771/html/node33.html
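
To make the difference concrete, here is a minimal 1-D sketch (not from the linked page); the toy functions g(x) = (x - 3)^2 and f(x) = -(x - 3)^2, the learning rate, and the iteration count are arbitrary illustrative choices:

    # Gradient descent minimizes g(x) = (x - 3)^2; gradient ascent maximizes f(x) = -(x - 3)^2.
    # Both optima sit at x = 3; only the sign of the step differs.

    def grad_g(x):          # derivative of g(x) = (x - 3)^2
        return 2 * (x - 3)

    def grad_f(x):          # derivative of f(x) = -(x - 3)^2
        return -2 * (x - 3)

    lr = 0.1
    x_desc = x_asc = 0.0
    for _ in range(200):
        x_desc -= lr * grad_g(x_desc)   # descent: step against the gradient
        x_asc  += lr * grad_f(x_asc)    # ascent: step along the gradient

    print(round(x_desc, 4), round(x_asc, 4))  # both approach 3.0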

Upvotes: 8

Anjali Yadav

Reputation: 41

Gradient ascent maximizes a function in order to reach a better optimum; it is used, for example, in reinforcement learning, and corresponds to moving up the slope of the objective.

Gradient descent minimizes a function, such as the cost function in linear regression, and corresponds to moving down the slope of the cost.

Upvotes: 4

Shriram

Reputation: 4931

If you want to minimize a function, you use Gradient Descent. For example, in deep learning we want to minimize the loss function, hence we use Gradient Descent.

If you want to maximize a function, you use Gradient Ascent. For example, in Reinforcement Learning, Policy Gradient methods aim to maximize the reward/expected return function, hence we use Gradient Ascent.

Upvotes: 3

John Doe

Reputation: 62

Gradient is another word for slope. A positive gradient at a point (x, y) means the graph slopes upwards at that point; a negative gradient means it slopes downwards there.

Gradient descent is an iterative algorithm used to find the set of theta that minimizes the value of a cost function. Correspondingly, gradient ascent finds the set of theta that maximizes the value of an objective function.
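
As a concrete sketch (not part of the original answer), here is a minimal gradient descent loop that finds the theta minimizing a mean-squared-error cost on toy linear-regression data; the data, learning rate, and iteration count are arbitrary illustrative choices.

    import numpy as np

    # Toy linear-regression data (illustrative): y = 1 + 2*x, with a bias column of ones.
    X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
    y = np.array([1.0, 3.0, 5.0, 7.0])

    theta = np.zeros(2)   # parameters to learn
    lr = 0.05             # learning rate (arbitrary)

    for _ in range(2000):
        residual = X @ theta - y          # prediction error
        grad = X.T @ residual / len(y)    # gradient of the MSE cost w.r.t. theta
        theta -= lr * grad                # descent: step against the gradient

    print(theta)  # approximately [1.0, 2.0], the theta that minimizes the cost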

Upvotes: 0

user2489252

Reputation:

Typically, you'd use gradient ascent to maximize a likelihood function and gradient descent to minimize a cost function. Both gradient descent and ascent are practically the same. Let me give you a concrete example using a simple algorithm that lends itself to gradient-based optimization, with a concave/convex likelihood/cost function: logistic regression.

Unfortunately, SO doesn't render LaTeX, so the formulas below are written out in plain LaTeX-style notation.

The likelihood function that you want to maximize in logistic regression is

L(w) = P(y | x; w) = \prod_{i=1}^{n} P\big(y^{(i)} \mid x^{(i)}; w\big) = \prod_{i=1}^{n} \big(\phi(z^{(i)})\big)^{y^{(i)}} \big(1 - \phi(z^{(i)})\big)^{1 - y^{(i)}}

where "phi" is simply the sigmoid function

\phi(z) = \frac{1}{1 + e^{-z}}, \quad \text{with } z = w^T x

Now, you want a concave function for gradient ascent, thus take the log:

l(w) = \log L(w) = \sum_{i=1}^{n} \Big[ y^{(i)} \log\big(\phi(z^{(i)})\big) + \big(1 - y^{(i)}\big) \log\big(1 - \phi(z^{(i)})\big) \Big]

Similarly, you can simply take its negative to get the cost function that you can minimize via gradient descent.

J(w) = -l(w) = \sum_{i=1}^{n} \Big[ -y^{(i)} \log\big(\phi(z^{(i)})\big) - \big(1 - y^{(i)}\big) \log\big(1 - \phi(z^{(i)})\big) \Big]

For the log-likelihood, you'd derive the gradient and apply the gradient ascent update as follows:

\frac{\partial l(w)}{\partial w_j} = \sum_{i=1}^{n} \big(y^{(i)} - \phi(z^{(i)})\big) x_j^{(i)}

w_j := w_j + \eta \sum_{i=1}^{n} \big(y^{(i)} - \phi(z^{(i)})\big) x_j^{(i)}

Since you'd want to update all weights simultaneously, let's write it as

w := w + \Delta w, \quad \Delta w = \eta \, \nabla l(w)

Now, it should be easy to see that the gradient descent update is the same as the gradient ascent one; just keep in mind that it is formulated as "taking a step in the opposite direction of the gradient of the cost function":

w := w + \Delta w, \quad \Delta w = -\eta \, \nabla J(w) = \eta \, \nabla l(w)
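
To make the equivalence concrete, here is a minimal NumPy sketch (not part of the original answer) that runs gradient ascent on the log-likelihood l(w) and gradient descent on the cost J(w) = -l(w); the toy data, learning rate, and iteration count are arbitrary illustrative choices.

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    # Toy data: two features, binary labels (illustrative values only).
    X = np.array([[0.5, 1.2], [1.0, -0.3], [-1.5, 0.8], [2.0, 1.0]])
    y = np.array([1.0, 1.0, 0.0, 1.0])

    eta, n_iter = 0.1, 100

    # Gradient ascent on the log-likelihood l(w).
    w_asc = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad_l = X.T @ (y - sigmoid(X @ w_asc))    # gradient of l(w)
        w_asc += eta * grad_l                      # step *up* the gradient

    # Gradient descent on the cost J(w) = -l(w).
    w_desc = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad_J = -X.T @ (y - sigmoid(X @ w_desc))  # gradient of J(w)
        w_desc -= eta * grad_J                     # step *down* the gradient

    print(np.allclose(w_asc, w_desc))  # True: the two updates coincide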

Hope that answers your question!

Upvotes: 25

Sean Owen

Reputation: 66886

It is not different. Gradient ascent is just the process of maximizing, instead of minimizing, a loss function. Everything else is entirely the same. Ascent for some loss function, you could say, is like gradient descent on the negative of that loss function.
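
A minimal sketch of that point, assuming PyTorch is available (the toy objective, learning rate, and iteration count are arbitrary): stock optimizers minimize, so maximizing an objective is done by descending on its negative.

    import torch

    # Maximize f(w) = -(w - 2)^2 by minimizing its negative with a stock optimizer.
    w = torch.zeros(1, requires_grad=True)
    opt = torch.optim.SGD([w], lr=0.1)

    for _ in range(200):
        objective = -(w - 2) ** 2   # the function we want to maximize
        loss = -objective           # negate it so the minimizer performs ascent
        opt.zero_grad()
        loss.backward()
        opt.step()

    print(w.item())  # approximately 2.0, the maximizer of the objective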

Upvotes: 30
