Archie

Reputation: 353

Neural Networks - why is my training error increasing as I add hidden units (neurons)?

I'm trying to optimise the number of hidden units in my MLP.

I'm using k-fold cross validation, with 10 folds - 16200 training points and 1800 validation points in each fold.

When I run the network with the number of hidden units varying from 1 to 10, I find the minimum error always occurs at 2 hidden units (NMSE of about 7). With 3 it is slightly higher (NMSE of about 11), and with 4 or more the error stays roughly constant at about 14 or 15, regardless of how many I add.
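For reference, my sweep looks roughly like the sketch below. This is a minimal stand-in using scikit-learn's MLPRegressor and KFold rather than my actual code, and the data arrays are placeholders:

```python
import numpy as np
from sklearn.model_selection import KFold
from sklearn.neural_network import MLPRegressor

# Placeholder data standing in for my 18000 samples (e.g. voltage, current).
X = np.random.rand(18000, 2)
y = np.random.rand(18000)

kf = KFold(n_splits=10, shuffle=True, random_state=0)

for n_hidden in range(1, 11):  # 1 to 10 hidden units
    fold_nmse = []
    for train_idx, val_idx in kf.split(X):
        mlp = MLPRegressor(hidden_layer_sizes=(n_hidden,), max_iter=500)
        mlp.fit(X[train_idx], y[train_idx])
        pred = mlp.predict(X[val_idx])
        # Normalised MSE on the 1800-point validation fold
        fold_nmse.append(np.mean((pred - y[val_idx]) ** 2) / np.var(y[val_idx]))
    print(n_hidden, np.mean(fold_nmse))
```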

Why is this?

I find it hard to believe that overfitting is occurring, given the very large number of data points being used (across all 10 folds, that's 162000 training points, albeit with each point repeated 9 times).

Many thanks for any help or advice!

Upvotes: 1

Views: 540

Answers (1)

Lukasz Tracewski

Reputation: 11397

If the input is voltage and current, and the question is about the power generated, then it's just P = V*I. Even if you have some noise, the relationship will still be that simple: the power is linear in the product V*I, so a linear model that includes that interaction feature captures it exactly. In this case a simple linear model would do just fine, and would be far nicer to interpret! That's why the simple ANN works best and the more complex ones overfit: they look for non-linear relationships which are not there, because the network does whatever minimises the cost function.
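To make that concrete, here is a minimal sketch, assuming the voltage/current/power setup above (the synthetic data is a placeholder): a plain linear fit on the single product feature V*I recovers the power relationship almost exactly, even with noise.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)

# Synthetic stand-in data: voltage, current, and noisy power P = V * I.
V = rng.uniform(100.0, 240.0, size=5000)
I = rng.uniform(0.0, 10.0, size=5000)
P = V * I + rng.normal(scale=5.0, size=5000)

# Single engineered feature: the product V * I. The target is linear in it.
X = (V * I).reshape(-1, 1)

model = LinearRegression().fit(X, P)
print(model.coef_[0], model.intercept_)  # coefficient close to 1, intercept close to 0
```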

To summarise, I would recommend trying a simple linear model first. Also, since you have a lot of data points, make a 50-25-25 split into training, validation and test sets. Watch your cost function and see how it changes with the error rate.
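A quick sketch of that split, assuming scikit-learn (two chained calls to train_test_split; X and y stand in for your data):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder arrays standing in for your dataset.
X = np.random.rand(18000, 2)
y = np.random.rand(18000)

# 50% training, then split the remaining half into
# validation and test (25% each of the original data).
X_train, X_rest, y_train, y_rest = train_test_split(X, y, train_size=0.5, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(X_rest, y_rest, train_size=0.5, random_state=0)
```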

Upvotes: 1
