Paler

Reputation: 349

Batch gradient descent algorithm does not converge

I'm trying to implement the batch gradient descent algorithm for my machine learning homework. I have a training set whose x values are around 10^3 and y values are around 10^6. I'm trying to find the values of [theta0, theta1] that make y = theta0 + theta1 * x converge. I set the learning rate to 0.0001 and the maximum number of iterations to 10. Here's my code in Qt.

QVector<double> gradient_descent_batch(QVector<double> x, QVector<double>y)
{
    QVector<double> theta(0);
    theta.resize(2);

    int size = x.size();

    theta[1] = 0.1;
    theta[0] = 0.1;

    for (int j=0;j<MAX_ITERATION;j++)
    {
        double dJ0 = 0.0;
        double dJ1 = 0.0;

        for (int i=0;i<size;i++)
        {
            dJ0 += (theta[0] + theta[1] * x[i] - y[i]);
            dJ1 += (theta[0] + theta[1] * x[i] - y[i]) * x[i];
        }

        double theta0 = theta[0];
        double theta1 = theta[1];
        theta[0] = theta0 - LRATE * dJ0;
        theta[1] = theta1 - LRATE * dJ1;

        if (qAbs(theta0 - theta[0]) < THRESHOLD && qAbs(theta1 - theta[1]) < THRESHOLD)
            return theta;
    }

    return theta;
}

I print the value of theta on every iteration, and here's the result.

QVector(921495, 2.29367e+09) 
QVector(-8.14503e+12, -1.99708e+16) 
QVector(7.09179e+19, 1.73884e+23) 
QVector(-6.17475e+26, -1.51399e+30) 
QVector(5.3763e+33, 1.31821e+37) 
QVector(-4.68109e+40, -1.14775e+44) 
QVector(4.07577e+47, 9.99338e+50) 
QVector(-3.54873e+54, -8.70114e+57) 
QVector(3.08985e+61, 7.57599e+64) 
QVector(-2.6903e+68, -6.59634e+71) 

It seems that theta will never converge. I followed the solution here and set the learning rate to 0.00000000000001 and the maximum number of iterations to 20, but it still does not converge. Here's the result.

QVector(0.100092, 0.329367) 
QVector(0.100184, 0.558535) 
QVector(0.100276, 0.787503) 
QVector(0.100368, 1.01627) 
QVector(0.10046, 1.24484) 
QVector(0.100552, 1.47321) 
QVector(0.100643, 1.70138) 
QVector(0.100735, 1.92936) 
QVector(0.100826, 2.15713) 
QVector(0.100918, 2.38471) 
QVector(0.101009, 2.61209) 
QVector(0.1011, 2.83927) 
QVector(0.101192, 3.06625) 
QVector(0.101283, 3.29303) 
QVector(0.101374, 3.51962) 
QVector(0.101465, 3.74601) 
QVector(0.101556, 3.9722) 
QVector(0.101646, 4.1982) 
QVector(0.101737, 4.424) 
QVector(0.101828, 4.6496) 

What's wrong?

Upvotes: 0

Views: 483

Answers (1)

yfy

Reputation: 48

Firstly, your algorithm seems fine, except that you should divide the accumulated gradient by size:

theta[0] = theta0 - LRATE * dJ0 / size;
theta[1] = theta1 - LRATE * dJ1 / size;

I would also suggest calculating the cost function and monitoring it:

    J(theta0, theta1) = 1/(2m) * sum_{i=1}^{m} (theta0 + theta1 * x[i] - y[i])^2

Your cost should decrease on every iteration. If it bounces back and forth, you are using too large a learning rate. I would suggest using 0.01 and doing 400 iterations.

Upvotes: 1
