Fcoder

Reputation: 9216

Classifying handwritten digits with single layer perceptron

I want to classify handwritten digits (MNIST) with simple Python code. My method is a simple single-layer perceptron, and I train it with the batch method.

My problem is that if, for example, I train on digit "1" and then on the other digits, the network always shows the result for "1". In effect, training only happens for the first digit. I don't know what the problem is.

I think this is related to the batch training: after training once, the second digit can't be learned because the network has already converged, but I can't figure out how to solve it.

I also tested with a multi-layer perceptron and got the same behaviour.

NOTE: Each time I choose one digit, load many samples of it, and start training; for the other digits I reset everything except the weight matrix (w0).

This is my code:

1- Importing libraries:

import os, struct
from array import array as pyarray
from numpy import append, array, int8, uint8, zeros
import numpy as np
from IPython.display import Image
import matplotlib.pyplot as plt
from IPython import display
from scipy.special import expit
from scipy.misc import imresize
from IPython.core.page import page
from IPython.core.formatters import format_display_data

np.set_printoptions(threshold=np.nan)
np.set_printoptions(suppress=True)

2- Sigmoid function:

def sigmoid(x, deriv=False):
    if(deriv==True):
        # derivative, assuming x already holds sigmoid(x)
        return x*(1-x)
    return expit(x)

3- Initializing weights

np.random.seed(1)
w0 = 2*np.random.random((784,10))-1

4- Reading MNIST dataset

dataset="training"
path="."

if dataset == "training":
    fname_img = os.path.join(path, 'train-images-idx3-ubyte')
    fname_lbl = os.path.join(path, 'train-labels-idx1-ubyte')
elif dataset == "testing":
    fname_img = os.path.join(path, 't10k-images-idx3-ubyte')
    fname_lbl = os.path.join(path, 't10k-labels-idx1-ubyte')
else:
    raise ValueError("dataset must be 'testing' or 'training'")

flbl = open(fname_lbl, 'rb')
magic_nr, size = struct.unpack(">II", flbl.read(8))
lbl = pyarray("b", flbl.read())
flbl.close()

fimg = open(fname_img, 'rb')
magic_nr, size, rows, cols = struct.unpack(">IIII", fimg.read(16))
img = pyarray("B", fimg.read())
fimg.close()

5- Choosing a number

number = 4
digits=[number]
ind = [ k for k in range(size) if lbl[k] in digits ]
N = len(ind)

images = zeros((N, rows, cols), dtype=uint8)
labels = zeros((N, 1), dtype=int8)

for i in range(len(ind)):
    images[i] = array(img[ ind[i]*rows*cols : (ind[i]+1)*rows*cols ]).reshape((rows, cols))
    labels[i] = lbl[ind[i]]

6- Converting each digit to a vector and converting matrix cells to binary:

p = np.reshape(images,(len(images),784))
p[p > 0] = 1

7- Target matrix (each column for a digit)

t = np.zeros((len(images), 10),dtype=float)
t[:,number] = 1

8- Training (gradient descent)

for iter in xrange(600):
    predict = sigmoid(np.dot(p,w0))        # forward pass
    e0 = predict - t                       # output error
    delta0 = e0 * sigmoid(predict,True)    # error times sigmoid derivative
    w0 -= 0.01*np.dot(p.T,delta0)          # batch gradient descent step

9- Testing

test_predict = sigmoid(np.dot(p[102],w0))
print test_predict

Upvotes: 0

Views: 2766

Answers (2)

Andy Jang

Reputation: 1

If your goal is to make a perceptron that can classify a certain digit, the initialization of the weights (step 3) should happen right before training (step 8), so that the weights are re-initialized every time you train the model for a different digit.

In summary, I would move #3 right before #8.
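A minimal sketch of that reordering, reusing the p, t, sigmoid and learning rate from the question; the only change is that the weights are re-created just before each training run:

# Re-initialize the weights immediately before training (step 3 moved
# in front of step 8), so each digit starts from fresh weights instead
# of weights that have already converged on the previous digit.
np.random.seed(1)
w0 = 2*np.random.random((784,10))-1

for iter in xrange(600):
    predict = sigmoid(np.dot(p,w0))
    e0 = predict - t
    delta0 = e0 * sigmoid(predict,True)
    w0 -= 0.01*np.dot(p.T,delta0)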

Upvotes: 0

Frank Puffer

Reputation: 8215

It makes no sense to train the network with data from a single class (digit) until it converges, then add another class and so on.

If you only train with one class, the desired output will always be the same and the network will probably converge quickly. It will probably produce this output for all kinds of input patterns, not just the ones you used for training.

What you need to do is present inputs from all classes during training, for example in random order. This way the network will be able to find the boundaries between the different classes.
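For example, here is a minimal sketch of that change. It assumes step 5 is modified to load samples of all ten digits (digits = range(10)) and reuses the images, labels and sigmoid from the question together with a freshly initialized w0 (step 3); averaging the gradient over the batch is a small departure from the question's code, just to keep the step size reasonable for the much larger training set:

# Build the input matrix and a one-hot target row for every sample,
# covering all ten classes, then train on the whole (shuffled) set.
p = np.reshape(images,(len(images),784))
p[p > 0] = 1

t = np.zeros((len(images), 10), dtype=float)
t[np.arange(len(images)), labels.ravel()] = 1   # column = true digit

perm = np.random.permutation(len(p))            # random sample order
p = p[perm]
t = t[perm]

for iter in xrange(600):
    predict = sigmoid(np.dot(p,w0))
    e0 = predict - t
    delta0 = e0 * sigmoid(predict,True)
    w0 -= 0.01*np.dot(p.T,delta0)/len(p)        # mean gradient over the batch

After training, np.argmax(sigmoid(np.dot(p[i],w0))) gives the predicted digit for sample i, which is a more useful check than reading off a single output vector.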

Upvotes: 2
