Reputation: 8627
Can anyone recommend a good source on how to choose a solver's hyper-parameters based on the complexity of my problem?
I understand that many people feel they are "shooting in the dark" when setting and then adjusting these parameters, and I have not been able to find a system or benchmark for choosing them based on the complexity of a specific problem or dataset.
If you care to explain your own methodology, or simply comment on your source, it would be much appreciated.
Upvotes: 2
Views: 187
Reputation: 1644
Since the hyperparameters we're talking about relate to backpropagation, which is a gradient-based approach, I believe the main reference is Y. Bengio, along with the more classic LeCun et al.
There are three main approaches to finding the optimal value of a hyperparameter. The first two are well explained in the first paper I linked.
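As an illustration only, here is a minimal sketch of random search, one commonly used way to tune hyperparameters such as the learning rate and momentum. I am not claiming it is how the papers present it: the toy regression problem, the search ranges, and the names `make_data`, `train_and_validate` and `random_search` are all assumptions made up for this example.

```python
import math
import random


def make_data(n, rng):
    """Toy data (an assumption for this sketch): y = 3x + 1 plus small noise."""
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    return [(x, 3.0 * x + 1.0 + rng.gauss(0.0, 0.1)) for x in xs]


def train_and_validate(learning_rate, momentum, steps=200, seed=0):
    """Fit w and b by gradient descent with momentum; return validation MSE."""
    rng = random.Random(seed)
    train, valid = make_data(50, rng), make_data(50, rng)
    w = b = vw = vb = 0.0
    for _ in range(steps):
        # Gradients of the mean squared error on the training set.
        grad_w = sum(2.0 * (w * x + b - y) * x for x, y in train) / len(train)
        grad_b = sum(2.0 * (w * x + b - y) for x, y in train) / len(train)
        # Classical momentum update.
        vw = momentum * vw - learning_rate * grad_w
        vb = momentum * vb - learning_rate * grad_b
        w, b = w + vw, b + vb
        if not (math.isfinite(w) and math.isfinite(b)):
            return math.inf  # treat a diverged run as the worst possible score
    errors = [w * x + b - y for x, y in valid]
    return sum(e * e for e in errors) / len(errors)


def random_search(n_trials=20, seed=42):
    """Sample configurations at random and keep the one with the lowest loss."""
    rng = random.Random(seed)
    best_loss, best_config = math.inf, None
    for _ in range(n_trials):
        config = {
            # Learning rate sampled log-uniformly in [1e-4, 1] (assumed range).
            "learning_rate": 10.0 ** rng.uniform(-4.0, 0.0),
            # Momentum sampled uniformly in [0, 0.99] (assumed range).
            "momentum": rng.uniform(0.0, 0.99),
        }
        loss = train_and_validate(**config)
        if loss < best_loss:
            best_loss, best_config = loss, config
    return best_loss, best_config


if __name__ == "__main__":
    loss, config = random_search()
    print("best validation MSE:", loss)
    print("best configuration:", config)
```

Sampling the learning rate on a log scale is a deliberate choice here: its useful values typically span several orders of magnitude, so a uniform sample would spend almost all trials near the top of the range. For a real network you would replace `train_and_validate` with your own training run and score it on a held-out validation set.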
Upvotes: 3
Reputation: 134
I think this is the main reference:
http://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf
Also take a look at Chapter 5 in: http://neuralnetworksanddeeplearning.com/
Upvotes: 0