Reputation: 11
I'm trying to use a linear regression program to predict handwritten digits using the MNIST dataset. Whenever I run it, the gradient descent step takes a very long time to approach the correct weights. In eight hours it has gone through the loop 550 times and there is still a lot of error. Can someone tell me whether it normally takes this long, or whether I am doing something wrong?
import numpy as np
import pandas as pd
mnist = pd.read_csv('mnist_train.csv')[:4200]
x = np.array(mnist)[:4200,1:]
y = np.array(mnist)[:4200,0].reshape(4200,1)
#How many numbers in dataset
n = len(x)
#How many values in each number
n1 = len(x[0])
#sets all weights equal to 1
coef = np.array([1 for i in range(n1)])
epochs = 1000000000000
learning_rate = .000000000008999
for i in range(epochs):
    cur_y = sum(x*coef)
    error = y-cur_y
    #Calculates Gradient
    grad = (np.array([sum(sum([-2/n * (error)* x[j,i] for j in range(n)])) for i in range(n1)]))
    #Updates Weights
    coef = (-learning_rate * grad) + coef
    print(i)
    print(sum(y-(x*coef)))
Upvotes: 0
Views: 87
Reputation: 1326
Your learning rate is extremely tiny. Also, 784 dimensions is a lot for linear regression to tackle, especially assuming you're using all 60,000 samples. An SVM would work better, and obviously a CNN would be best.
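For example, with scikit-learn (assuming you have it installed; the file name and slicing here just mirror your snippet, and the train/test split is arbitrary) an SVM baseline could look roughly like this:

    import numpy as np
    import pandas as pd
    from sklearn.svm import SVC

    # Same first 4200 rows as in your snippet; the label is the first column.
    mnist = pd.read_csv('mnist_train.csv')[:4200]
    x = np.array(mnist)[:, 1:] / 255.0   # scale pixels to [0, 1]
    y = np.array(mnist)[:, 0]

    clf = SVC()                          # default RBF kernel
    clf.fit(x[:4000], y[:4000])          # fit on most of the rows
    print(clf.score(x[4000:], y[4000:])) # accuracy on the held-out rows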
Given that your error is getting smaller, I would recommend increasing your learning rate and training with stochastic gradients (grabbing a random batch from your training set for each epoch instead of using the whole training set).
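In case it helps, here is a rough sketch of what that could look like for your setup. The learning rate, batch size, and pixel scaling are illustrative choices you would need to tune, and the gradient is the same mean-squared-error gradient as in your code, just computed with matrix operations instead of Python loops:

    import numpy as np
    import pandas as pd

    # Same data as in the question, scaled to [0, 1] so the gradients
    # (and therefore the learning rate) stay in a reasonable range.
    mnist = pd.read_csv('mnist_train.csv')[:4200]
    x = np.array(mnist)[:, 1:] / 255.0
    y = np.array(mnist)[:, 0].astype(float)

    n, n1 = x.shape
    coef = np.zeros(n1)

    learning_rate = 0.01   # illustrative value, not tuned
    epochs = 1000
    batch_size = 64

    for epoch in range(epochs):
        # Grab a random mini-batch instead of the whole training set.
        idx = np.random.randint(0, n, batch_size)
        xb, yb = x[idx], y[idx]

        # Vectorized prediction and gradient of the mean squared error.
        pred = xb @ coef
        error = yb - pred
        grad = -2.0 / batch_size * (xb.T @ error)

        coef -= learning_rate * grad

        if epoch % 100 == 0:
            mse = np.mean((y - x @ coef) ** 2)
            print(epoch, mse)

Each update only touches 64 rows, so an epoch is much cheaper than your full-batch version, and the vectorized `xb.T @ error` avoids the nested Python loops that are making each of your iterations so slow.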
Upvotes: 1