Reputation: 3520
I'm trying to develop a new weight initialization method, but I'm getting a strange training phenomenon. You can see that output node 8 is never the max activation...
I'm using the MATLAB patternnet with tansig activation, mse performance, and no bias nodes. I'm trying to classify a subset of the MNIST database.
Does anyone have ideas on how to troubleshoot this? With Nguyen-Widrow initialization I do not see this behavior, despite the architecture being the same.
Edit:
Inputs: 768xN of values between 0 and 1
Targets: 10xN of values 0 or 1 per respective row. So it's like a logical matrix with one true value per column.
One or more nodes do not activate; I showed the best case.
This occurs with one or more layers (1 to 5) and with less or more training data (1k to 10k samples).
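One way to quantify the symptom described above is to count how often each output node has the largest activation across a set of samples. The following is only a minimal sketch in plain Python, not MATLAB: the helper names `argmax_counts` and `never_winning_nodes` are hypothetical, the activations are made-up toy numbers, and samples are stored one per row (the transpose of the 10xN layout above). Indices are 0-based, so MATLAB's "node 8" would be index 7 here.

```python
from collections import Counter

def argmax_counts(outputs, n_classes):
    """Count how often each output node has the largest activation.

    `outputs` is a list of per-sample activation lists (one value per node).
    """
    counts = Counter()
    for sample in outputs:
        winner = max(range(n_classes), key=lambda k: sample[k])
        counts[winner] += 1
    # Report every node, including those that never win.
    return [counts.get(k, 0) for k in range(n_classes)]

def never_winning_nodes(outputs, n_classes):
    """Return the indices of nodes that are never the max activation."""
    return [k for k, c in enumerate(argmax_counts(outputs, n_classes)) if c == 0]

# Toy activations for 3 output nodes over 4 samples (made-up numbers).
toy = [
    [0.9, 0.1, 0.2],
    [0.2, 0.1, 0.8],
    [0.7, 0.3, 0.1],
    [0.1, 0.0, 0.6],
]
print(argmax_counts(toy, 3))        # [2, 0, 2]
print(never_winning_nodes(toy, 3))  # [1]
```

Running the same count right after initialization and again after training would show whether a node is already suppressed by the initial weights or only becomes so during training.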
Upvotes: 2
Views: 371
Reputation: 3520
I think I found a solution to the problem.
By scaling the weights so that they lie only within the significant domain of the transfer function (-1 to 1), I no longer see this phenomenon.
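The exact scaling rule is not given above, so the following is only one possible reading, sketched in plain Python rather than MATLAB: each neuron's weight row is shrunk so that, for inputs between 0 and 1, the net input can never leave [-1, 1], where tansig (mathematically tanh) is not yet saturated. The function names, the `limit` parameter, the layer size of 4, and the uniform random starting weights are all assumptions made for illustration.

```python
import math
import random

def scale_rows_to_active_region(weights, limit=1.0):
    """Rescale each neuron's weight row so |net input| <= limit for inputs in [0, 1]."""
    scaled = []
    for row in weights:
        # With inputs in [0, 1], the net input is bounded by the larger of
        # the positive-weight sum and the magnitude of the negative-weight sum.
        pos = sum(w for w in row if w > 0)
        neg = -sum(w for w in row if w < 0)
        bound = max(pos, neg)
        # Only shrink rows that could exceed the limit; leave the rest alone.
        factor = limit / bound if bound > limit else 1.0
        scaled.append([w * factor for w in row])
    return scaled

def net_input(row, x):
    """Weighted sum of the inputs for one neuron (no bias, as in the question)."""
    return sum(w * xi for w, xi in zip(row, x))

random.seed(0)
n_inputs, n_hidden = 768, 4  # 768 inputs as in the question; 4 neurons is arbitrary
raw = [[random.uniform(-1, 1) for _ in range(n_inputs)] for _ in range(n_hidden)]
scaled = scale_rows_to_active_region(raw)

# Compare activations before and after scaling for one random input in [0, 1].
x = [random.random() for _ in range(n_inputs)]
for r, s in zip(raw, scaled):
    print(round(math.tanh(net_input(r, x)), 3), round(math.tanh(net_input(s, x)), 3))
```

This bound is a worst case, so it is conservative; a looser rule based on the typical input magnitude might keep more signal while still avoiding saturation.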
Upvotes: 1