Reputation: 667
I was following Andrew Ng's course on ML, and in the Neural Networks week 4 slides, while talking about Model Representation I, he mentions that the dimension of the weight matrix is 3 x 4, as shown below:
I know there is a formula which says that if there are s_j nodes in layer j and s_(j+1) nodes in layer j+1, then the dimension of the matrix mapping from layer j to layer j+1 will be s_(j+1) x (s_j + 1).
But I don't know how the formula was derived, and hence I am not able to understand the above example.
Upvotes: 1
Views: 1536
Reputation: 620
From the image we understand the following:
In the architecture shown we have three layers. For example, in the first layer, j = 1 and the number of nodes is s1 = 3.
Now, Theta is the matrix of weights between two consecutive layers; it is also known as the weights or parameters. Since we have three layers, there are two such matrices, Theta1 and Theta2:
weight_matrix1 (Theta1) maps the input layer to the hidden layer
weight_matrix2 (Theta2) maps the hidden layer to the output layer
By the formula:
The dimensions of weight_matrix1 (Theta1, between layers 1 and 2) = s2 x (s1 + 1) = 3 x (3 + 1) = 3 x 4
The dimensions of weight_matrix2 (Theta2, between layers 2 and 3) = s3 x (s2 + 1) = 1 x (3 + 1) = 1 x 4
We add 1 to account for the bias unit x0, which is always set to 1. For the matrix product Theta * a to be defined, the number of columns in the weight matrix must equal the number of rows in the activation column vector (which includes the bias unit). We know the dimensions of layer 1 and layer 2 in this example, from which we calculate the dimensions of the weight matrix Theta.
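Here is a minimal NumPy sketch of the forward pass for this 3-3-1 network; the input values and the random weights are made up purely to illustrate the shapes:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Layer sizes of the 3-3-1 network in the example
s1, s2, s3 = 3, 3, 1

# Random weights, used here only to show the dimensions
Theta1 = np.random.randn(s2, s1 + 1)  # 3 x 4: input layer  -> hidden layer
Theta2 = np.random.randn(s3, s2 + 1)  # 1 x 4: hidden layer -> output layer

x = np.array([0.5, -1.2, 2.0])        # one made-up example with 3 features

a1 = np.concatenate(([1.0], x))       # prepend bias unit x0 = 1 -> shape (4,)
a2 = sigmoid(Theta1 @ a1)             # (3, 4) @ (4,) -> (3,)
a2 = np.concatenate(([1.0], a2))      # prepend bias unit again  -> shape (4,)
a3 = sigmoid(Theta2 @ a2)             # (1, 4) @ (4,) -> (1,)

print(Theta1.shape, Theta2.shape, a3.shape)  # (3, 4) (1, 4) (1,)
```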
Upvotes: 0
Reputation: 1254
The size of a Theta (weight) matrix is (outputs x inputs).
The input includes a bias unit.
The output doesn't include the bias unit.
In the diagram, it will be [3 x (3+1)]. Here the additional 1 is the bias unit added to the input.
Hence the simple formula is s_(j+1) x (s_j + 1), which here is 3 x (3 + 1) = 3 x 4.
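To see the formula applied mechanically, here is a small sketch (a hypothetical helper, not from the course) that computes every Theta shape for an arbitrary list of layer sizes:

```python
def theta_shapes(layer_sizes):
    """Apply s_(j+1) x (s_j + 1) to each consecutive pair of layers."""
    return [(layer_sizes[j + 1], layer_sizes[j] + 1)
            for j in range(len(layer_sizes) - 1)]

print(theta_shapes([3, 3, 1]))  # [(3, 4), (1, 4)]: Theta1 is 3 x 4, Theta2 is 1 x 4
```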
Upvotes: 2