Reputation: 47
I have a model with N inputs and 6 outputs. After each epoch, my output looks like [x y z xx yy zz], and I want to minimize the MSE of each term separately. However, I've noticed that when I use MSE as the loss function, it just takes the mean of the sum of the squares over the entire set.
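For reference, a minimal sketch of that behaviour, assuming a Keras-style MSE in TensorFlow 2 with eager execution (the shapes and toy values are just for illustration): the loss first averages the squared error over the 6 outputs of each sample and then over the batch, producing a single scalar.

import numpy as np
import tensorflow as tf

y_true = np.zeros((3, 6), dtype=np.float32)   # 3 samples, 6 outputs each
y_pred = np.ones((3, 6), dtype=np.float32)

# Per-sample MSE: mean of the squared errors over the 6 outputs, shape (3,)
per_sample = tf.keras.losses.mean_squared_error(y_true, y_pred)

# The loss object then averages over the batch, giving one scalar for the whole set
loss = tf.keras.losses.MeanSquaredError()(y_true, y_pred)

print(per_sample.numpy(), loss.numpy())   # [1. 1. 1.]  1.0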
Upvotes: 3
Views: 2047
Reputation: 9075
I think they both mean the same thing. Let us denote your predictions for the $i$-th sample by $[x_i, y_i, z_i, xx_i, yy_i, zz_i]$ and the corresponding true values by $[t_{x_i}, t_{y_i}, t_{z_i}, t_{xx_i}, t_{yy_i}, t_{zz_i}]$. Over a batch of $N$ samples, you want to minimize:

$$L = \frac{1}{N}\sum_{i=1}^{N}(x_i - t_{x_i})^2 + \dots + \frac{1}{N}\sum_{i=1}^{N}(zz_i - t_{zz_i})^2$$
The MSE loss will minimize the following:
$$L = \frac{1}{N}\sum_{i=1}^{N}\frac{1}{6}\left[(x_i - t_{x_i})^2 + \dots + (zz_i - t_{zz_i})^2\right]$$
The two expressions differ only by the constant factor $\frac{1}{6}$, which does not change where the minimum lies, so both ultimately minimize the same quantity. This holds as long as your six outputs are independent targets, which I think they are, since you model them as six distinct outputs with six ground-truth labels.
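A quick NumPy check (with arbitrary toy shapes and values, purely for illustration) makes this concrete: the sum of the six per-output MSEs is exactly 6 times the single overall MSE, so minimizing one minimizes the other.

import numpy as np

np.random.seed(0)
preds = np.random.rand(4, 6)   # N = 4 samples, 6 predicted outputs each
truth = np.random.rand(4, 6)   # matching ground-truth values

# First form: per-output MSEs (averaged over the batch), then summed over the 6 outputs
per_output_sum = np.mean((preds - truth) ** 2, axis=0).sum()

# Second form: one MSE over the whole (4, 6) array, as a standard MSE loss computes it
overall_mse = np.mean((preds - truth) ** 2)

print(per_output_sum, 6 * overall_mse)   # equal up to floating-point error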
Upvotes: 2
Reputation: 1651
You have to create a tensor equal to the MSE and minimize that (here outputs are the model predictions and targets is assumed to be the ground-truth tensor):

mse = tf.reduce_mean(tf.square(outputs - targets))   # squared error, averaged over all samples and outputs
train_step = tf.train.*Optimizer(...).minimize(mse)  # any optimizer from tf.train
for _ in range(iterations):
    sess.run(train_step, ...)                        # feed a batch of inputs and targets each step
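For completeness, a self-contained sketch of the same idea in TF1 graph mode; the toy linear model, the shapes, the learning rate, and the choice of GradientDescentOptimizer are all assumptions for illustration, not part of the original answer.

import numpy as np
import tensorflow as tf

# Toy data: 128 samples, 10 input features, 6 target outputs (all shapes are assumptions)
x_data = np.random.rand(128, 10).astype(np.float32)
y_data = np.random.rand(128, 6).astype(np.float32)

inputs = tf.placeholder(tf.float32, shape=[None, 10])
targets = tf.placeholder(tf.float32, shape=[None, 6])

# A single linear layer standing in for the real model
weights = tf.Variable(tf.random_normal([10, 6]))
bias = tf.Variable(tf.zeros([6]))
outputs = tf.matmul(inputs, weights) + bias

# One scalar MSE over all six outputs, minimized directly
mse = tf.reduce_mean(tf.square(outputs - targets))
train_step = tf.train.GradientDescentOptimizer(0.01).minimize(mse)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(1000):
        sess.run(train_step, feed_dict={inputs: x_data, targets: y_data})
    print(sess.run(mse, feed_dict={inputs: x_data, targets: y_data}))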
Upvotes: 1