Reputation: 673
"Neural nets have a weight space symmetry: we can permute all the hidden units in a given layer and obtain an equivalent solution" (from CSC321, Lecture 10, Optimization)
This doesn't make sense to me. Is there something wrong with my understanding?
For example, take a simple DNN with 2 units in its only hidden layer, and suppose the loss surface has one local optimum and one global optimum, like this:
Obviously, 2 symmetric points will lead to different solutions, because they will converge to different optima (the bottom-right one is the global optimum).
Where does my reasoning go wrong?
Upvotes: 1
Views: 1565
Reputation: 39
Sure, if you randomly permute the weights of the input layer, you won't get the same result, because the order of the input elements matters.
The permutation symmetry is about permuting the neurons of a hidden layer, not about permuting the weights within a single neuron.
For example, suppose your hidden layer has 2 neurons with weights w11, w12, w13 and w21, w22, w23.
The permutation principle states that you can swap w11 <-> w21, w12 <-> w22 and w13 <-> w23 and the result will remain the same, provided the neurons' biases and their outgoing weights to the next layer are swapped in the same way.
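The swap described above can be checked numerically. The following is only a minimal sketch, assuming a hypothetical network with 3 inputs, 2 tanh hidden units and 1 output; the names `W1`, `b1`, `W2`, `b2` and `forward` are made up for illustration and do not come from the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical tiny network: 3 inputs -> 2 hidden units (tanh) -> 1 output.
W1 = rng.normal(size=(2, 3))  # row i holds w_i1, w_i2, w_i3 of hidden neuron i
b1 = rng.normal(size=2)       # one bias per hidden neuron
W2 = rng.normal(size=(1, 2))  # column i is the outgoing weight of hidden neuron i
b2 = rng.normal(size=1)

def forward(x, W1, b1, W2, b2):
    """Forward pass of the one-hidden-layer network."""
    h = np.tanh(W1 @ x + b1)
    return W2 @ h + b2

# Swap the two hidden neurons: reorder the rows of W1 and the entries of b1,
# and reorder the columns of W2 in the same way.
perm = [1, 0]
W1_p, b1_p = W1[perm], b1[perm]
W2_p = W2[:, perm]

x = rng.normal(size=3)
y_original = forward(x, W1, b1, W2, b2)
y_permuted = forward(x, W1_p, b1_p, W2_p, b2)

# The two weight settings are different points in weight space,
# but they compute the same function.
same_output = np.allclose(y_original, y_permuted)
print(same_output)
```

Because the two weight settings produce the same output for every input, they also have the same loss. In terms of the question, the mirror image of the global optimum is another optimum with exactly the same loss value, not a worse one.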
Upvotes: 0
Reputation: 310
I think you are missing the definition of symmetry.
Geometry is the branch of mathematics that studies invariants under some class of transformations. The invariants of a geometry are called the symmetries of that geometry. For instance, the symmetries of Euclidean geometry are lengths and angles, because rotations and translations (the group of Euclidean transformations) preserve them. In the same vein, the symmetry of affine geometry is parallelism.
In the context of deep learning, weight space symmetry means that non-identifiable models are invariant to permutations of the units in their hidden layers: the weight vector changes, but the function the network computes does not. More generally, since in deep learning there are usually not enough training samples to rule out all parameter settings but one, there usually exist many weight combinations for a given dataset that yield similar model performance.
Upvotes: 1
Reputation: 37
The weight symmetry here means that there is an equivalent set of weights that gives the same mapping from input to output. It does not mean geometrical symmetry in coordinate space. For a deeper look, see Bishop, Ch. 5.1.
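One way to write this equivalence down, as a sketch for a network with a single hidden layer: the symbols $W_1, b_1, W_2, b_2$ (layer weights and biases), $\sigma$ (an elementwise activation) and $P$ (a permutation matrix) are notation assumed here for illustration, not taken from the lecture.

$$
y = W_2\,\sigma(W_1 x + b_1) + b_2
  = W_2 P^{\top} P\,\sigma(W_1 x + b_1) + b_2
  = (W_2 P^{\top})\,\sigma\big((P W_1)\,x + P b_1\big) + b_2
$$

The first step uses $P^{\top} P = I$, and the second uses the fact that an elementwise $\sigma$ commutes with a permutation of its inputs. So the weights $(P W_1,\ P b_1,\ W_2 P^{\top},\ b_2)$ are a different point in weight space that gives exactly the same input-to-output mapping.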
Upvotes: -2