Maysam

Reputation: 7367

Training neural network tips

For object recognition purposes I have to use a neural network in MATLAB. I have 30 objects and 20 images of each object, so I have 600 input samples and 30 different classes. The input matrix is 100x600 and the target is 1x600. Each column of the input matrix is a 100-bin histogram of the keypoints' hue values, i.e. [m,n] = hist(hue_val,100), from which I take m.
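Roughly, I build the matrices like this (simplified; getHueValues stands in for my own keypoint-hue extraction code):

    numClasses = 30; imagesPerClass = 20;
    numSamples = numClasses * imagesPerClass;    % 600 samples total
    inputs  = zeros(100, numSamples);            % 100x600 input matrix
    targets = zeros(1, numSamples);              % 1x600 class labels
    k = 0;
    for c = 1:numClasses
        for i = 1:imagesPerClass
            k = k + 1;
            hue_val = getHueValues(c, i);        % placeholder for my keypoint-hue extraction
            [m, n] = hist(hue_val, 100);         % 100-bin hue histogram; m = counts
            inputs(:, k) = m(:);
            targets(k) = c;                      % class index as target
        end
    end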
If I choose an MLP network, how many layers and how many neurons per layer are needed, and which transfer function is suitable for each layer?

And one last question: do I need negative samples?

Upvotes: 3

Views: 2706

Answers (2)

Cristian Rodriguez

Reputation: 813

When I asked myself this, I found this page; maybe it can help.

Edit:

Sorry, I wanted to link to this page, where you can jump to the different questions, like "How many hidden layers should I use?" or "How many hidden units should I use?"

Upvotes: 3

zergylord

Reputation: 4446

  • Number of layers - In general, a single hidden layer is sufficient, since (as long as you're using a non-linear activation function) a single hidden layer can approximate networks with an arbitrary number of layers.
  • Transfer function - I'm not used to this term, but I assume you mean the activation function (what you apply to the net input before passing it to the next layer). I answered a slight variant of that question here, but the gist is that a standard choice like the hyperbolic tangent or logistic function works in most cases (there's a short MATLAB sketch after this list).
  • Number of neurons in your hidden layer - crodriguezo's link fielded this one quite nicely. All I can really add is that with your input size, I'd base this quantity largely on training time.
  • Negative samples - If you just need to classify which of the 30 objects an input belongs to, then negative samples aren't needed. But if a test input might be none of the 30 objects, then definitely use lots of negative examples so the network doesn't think everything is an object.
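To make the first three points concrete, here is a rough sketch using MATLAB's Neural Network Toolbox. patternnet and the specific values (25 hidden units, tansig/softmax) are just one reasonable starting configuration to tune, not the definitive setup:

    % inputs is 100x600; expand the 1x600 class labels into 30x600 one-hot targets
    targetsOneHot = full(ind2vec(targets));

    net = patternnet(25);                    % one hidden layer; 25 units is a guess to tune
    net.layers{1}.transferFcn = 'tansig';    % hyperbolic tangent in the hidden layer
    net.layers{2}.transferFcn = 'softmax';   % softmax output for 1-of-30 classification

    [net, tr] = train(net, inputs, targetsOneHot);
    predicted = vec2ind(net(inputs));        % map one-hot outputs back to class indices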

Some General Tips:

Remember to consider leave-one-out and similar forms of cross-validation as ways of combating overfitting. Limiting your hidden layer's unit count does this too, but at the cost of representational richness.
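The quickest built-in stand-in for this in the MATLAB toolbox is an explicit train/validation/test split with early stopping on the validation error; the ratios below are just an example, and true leave-one-out would need a manual loop over held-out samples:

    net.divideFcn = 'dividerand';        % randomly split the 600 samples
    net.divideParam.trainRatio = 0.70;
    net.divideParam.valRatio   = 0.15;   % validation set drives early stopping
    net.divideParam.testRatio  = 0.15;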

Other parameters you haven't mentioned are also very important to any successful ANN application. These include the learning rate, error function, annealing schedule, momentum, and weight decay. Setting all of these is more of an art than a science at this point (one of the best arguments against using ANNs rather than support vector machines), but this link has been a godsend for me in this area.
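In the MATLAB toolbox these knobs map roughly onto training-function parameters. A sketch with plain gradient descent plus momentum; the values are illustrative starting points, not recommendations:

    net.trainFcn = 'traingdm';                 % gradient descent with momentum
    net.trainParam.lr = 0.01;                  % learning rate
    net.trainParam.mc = 0.9;                   % momentum constant
    net.performFcn = 'mse';                    % error function
    net.performParam.regularization = 0.1;     % weight decay, if your toolbox version supports it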

Upvotes: 5
