Tony

Reputation: 1393

Linear regression with TensorFlow gives a noticeable mean squared error

I am new to TensorFlow and I am trying to implement a simple feed-forward network for regression, just for learning purposes. The complete executable code is as follows.

The regression mean squared error is around 6, which is quite large. This is unexpected, because the function being regressed is a simple linear one, 2*x + y, so I expected much better accuracy.

I am asking for help checking whether I did anything wrong in the code. I carefully checked the matrix dimensions, so those should be fine, but it is possible that I misunderstood something and the network or the session is not properly configured. For example, should I run the training step multiple times instead of just once (the code enclosed by the # TRAINING # comments below)? In some examples I have seen, data is fed in piece by piece and training runs progressively, whereas I run the training step once and feed in all the data (see the sketch right after this paragraph for the pattern I mean).
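To be concrete, this is the progressive, piece-by-piece pattern I mean (just a sketch using the same variable names as in my code below; the batch size of 100 and the 50 epochs are arbitrary choices, not something I have tried):

# sketch of the mini-batch pattern, assuming sess, optimizer, loss, lr,
# ph_inputs, ph_labels, inputs, labels and input_size are defined as in my code below
batch_size = 100
for epoch in range(50):
    for start in range(0, input_size, batch_size):
        batch_x = inputs[start:start + batch_size]
        batch_y = labels[start:start + batch_size]
        _, c = sess.run([optimizer, loss],
                        feed_dict={lr: 0.05, ph_inputs: batch_x, ph_labels: batch_y})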

If the code is fine, then maybe this is a modeling issue, but I would not expect to need a complicated network for such a simple regression.

import tensorflow as tf
import numpy as np
from sklearn.metrics import mean_squared_error

# inputs are points from a 100x100 grid in domain [-2,2]x[-2,2], total 10000 points
lsp = np.linspace(-2,2,100)
gridx,gridy = np.meshgrid(lsp,lsp)
inputs = np.dstack((gridx,gridy))
inputs = inputs.reshape(-1,inputs.shape[-1]) # reshapes the grid into a 10000x2 matrix
feature_size = inputs.shape[1] # feature_size is 2, features are the 2D coordinates of each point
input_size = inputs.shape[0] # input_size is 10000

# a simple function f(x)=2*x[0]+x[1] to regress
f = lambda x: 2 * x[0] + x[1]
label_size = 1
labels = f(inputs.transpose()).reshape(-1,1) # reshapes labels as a column vector

ph_inputs = tf.placeholder(tf.float32, shape=(None, feature_size), name='inputs')
ph_labels = tf.placeholder(tf.float32, shape=(None, label_size), name='labels')

# just one hidden layer with 16 units
hid1_size = 16
w1 = tf.Variable(tf.random_normal([hid1_size, feature_size], stddev=0.01), name='w1')
b1 = tf.Variable(tf.random_normal([hid1_size, label_size]), name='b1')
y1 = tf.nn.relu(tf.add(tf.matmul(w1, tf.transpose(ph_inputs)), b1))

# the output layer
wo = tf.Variable(tf.random_normal([label_size, hid1_size], stddev=0.01), name='wo')
bo = tf.Variable(tf.random_normal([label_size, label_size]), name='bo')
yo = tf.transpose(tf.add(tf.matmul(wo, y1), bo))

# defines optimizer and predictor
lr = tf.placeholder(tf.float32, shape=(), name='learning_rate')
loss = tf.losses.mean_squared_error(ph_labels,yo)
optimizer = tf.train.GradientDescentOptimizer(lr).minimize(loss)
predictor = tf.identity(yo)

# TRAINING 
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)
_, c = sess.run([optimizer, loss], feed_dict={lr:0.05, ph_inputs: inputs, ph_labels: labels})
# TRAINING 

# gets the regression results
predictions = np.zeros((input_size,1))
for i in range(input_size):
    predictions[i] = sess.run(predictor, feed_dict={ph_inputs: inputs[i, None]}).squeeze()

# prints regression MSE
print(mean_squared_error(predictions, labels))

Upvotes: 2

Views: 553

Answers (1)

nessuno

Reputation: 27052

You're right: you identified the problem yourself.

The problem is indeed that you run the optimization step only once. You therefore perform a single update of the network parameters, so the cost has no chance to decrease.

I only changed the training part of your code to make it work as expected (100 training steps):

# TRAINING
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)
for i in range(100):
    _, c = sess.run(
        [optimizer, loss],
        feed_dict={
            lr: 0.05,
            ph_inputs: inputs,
            ph_labels: labels
        })
    print("Train step {} loss value {}".format(i, c))
# TRAINING

and at the end of training I get:

Train step 99 loss value 0.04462708160281181

and the regression MSE printed at the end of your script drops to:

0.044106700712455045
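As a side note (an optional simplification, not part of the fix): since the whole 10000x2 grid fits in memory, the per-point prediction loop at the end of your script is not needed; a single sess.run over all the inputs gives the same result. A minimal sketch, reusing the names from your code:

# evaluate the trained network on the full input matrix in one call
predictions = sess.run(predictor, feed_dict={ph_inputs: inputs})

# prints the same regression MSE as the original per-point loop
print(mean_squared_error(predictions, labels))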

Upvotes: 4
