Reputation: 291
I am using TensorFlow 1.2; here's the code:
import tensorflow as tf
import tensorflow.contrib.layers as layers
import numpy as np
import tensorflow.contrib.learn as tflearn
tf.logging.set_verbosity(tf.logging.INFO)
# Naturally this is a very simple straight line
# of y = -x + 10
train_x = np.asarray([0., 1., 2., 3., 4., 5.])
train_y = np.asarray([10., 9., 8., 7., 6., 5.])
test_x = np.asarray([10., 11., 12.])
test_y = np.asarray([0., -1., -2.])
input_fn_train = tflearn.io.numpy_input_fn({"x": train_x}, train_y, num_epochs=1000)
input_fn_test = tflearn.io.numpy_input_fn({"x": test_x}, test_y, num_epochs=1000)
validation_monitor = tflearn.monitors.ValidationMonitor(
    input_fn=input_fn_test,
    every_n_steps=10)
fts = [layers.real_valued_column('x')]
estimator = tflearn.LinearRegressor(feature_columns=fts)
estimator.fit(input_fn=input_fn_train,
              steps=1000,
              monitors=[validation_monitor])
print(estimator.evaluate(input_fn=input_fn_test))
It runs, but not as I expect: the training stops at step 47 with a very high loss value:
INFO:tensorflow:Starting evaluation at 2017-06-18-20:52:10
INFO:tensorflow:Finished evaluation at 2017-06-18-20:52:10
INFO:tensorflow:Saving dict for global step 1: global_step = 1, loss = 12.5318
INFO:tensorflow:Validation (step 10): global_step = 1, loss = 12.5318
INFO:tensorflow:Saving checkpoints for 47 into
INFO:tensorflow:Loss for final step: 19.3527.
INFO:tensorflow:Starting evaluation at 2017-06-18-20:52:11
INFO:tensorflow:Restoring parameters from
INFO:tensorflow:Finished evaluation at 2017-06-18-20:52:11
INFO:tensorflow:Saving dict for global step 47: global_step = 47, loss = 271.831
{'global_step': 47, 'loss': 271.83133}
A few things I completely don't understand (admittedly I'm a complete noob in TF):
- Why does TF decide to stop the training at step 47, even though I asked for 1000 steps?
- Why is the loss reported at step 10 smaller than the loss at the final step 47?
I have implemented this very algorithm using vanilla TensorFlow and it works as expected, but I really can't grasp what LinearRegressor wants from me here.
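For context, the vanilla-TF version I mean is essentially a plain gradient-descent loop like this (a minimal sketch, not my exact script):
# Minimal plain-TF linear regression on the same data (sketch).
x = tf.placeholder(tf.float32, shape=[None])
y = tf.placeholder(tf.float32, shape=[None])
w = tf.Variable(0.0)
b = tf.Variable(0.0)
loss = tf.reduce_mean(tf.square(w * x + b - y))
train_op = tf.train.GradientDescentOptimizer(0.01).minimize(loss)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(1000):
        sess.run(train_op, feed_dict={x: train_x, y: train_y})
    print(sess.run([w, b]))  # converges near w = -1, b = 10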
Upvotes: 2
Views: 573
Reputation: 41
Here are some (partial) answers to your questions. They might not address everything, but hopefully they will give you some more insight.
Why does TF decide to stop the training after 47 steps? This has to do with the fact that you set num_epochs=1000 while the default batch_size of numpy_input_fn is 128 (see https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/learn/python/learn/learn_io/numpy_io.py). num_epochs=1000 means the fit method will go through the data at most 1000 times (or for 1000 steps, whichever comes first). Since your training set has 6 samples, 1000 epochs yield 1000 * 6 = 6000 examples, which fill ceiling(6000 / 128) = 47 batches; the input queue is then exhausted, so fit stops at step 47. Setting batch_size to 6 (the size of your training set) or num_epochs=None will give you more reasonable results. I suggest setting batch_size to at most 6, since cycling through your training samples more than once within a single step does not make much sense; see the sketch below.
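For example, a minimal sketch of that fix (batch_size and num_epochs are keyword arguments of the numpy_input_fn linked above):
# Feed the whole 6-sample training set each step and repeat it
# indefinitely, so training length is controlled only by steps=1000.
input_fn_train = tflearn.io.numpy_input_fn(
    {"x": train_x}, train_y,
    batch_size=6,      # one batch = the entire training set
    num_epochs=None)   # cycle forever; fit stops after `steps`
estimator.fit(input_fn=input_fn_train,
              steps=1000,
              monitors=[validation_monitor])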
Why is the loss on step 10 smaller than the loss on step 47? There are a few different reasons the loss might not decrease:
a. You are not computing the loss on the exact same data at each step. For instance, if your sample has size 100 and your batch_size is 32, each step computes the loss on the next batch of 32 examples, cycling through the data.
b. Your learning rate is too high, so the loss bounces around. To fix this, try reducing the learning rate, or experiment with different optimizers. I believe the default optimizer in LinearRegressor is FtrlOptimizer. You can change its default learning rate when you construct the LinearRegressor:
estimator = tflearn.LinearRegressor(
    feature_columns=fts,
    optimizer=tf.train.FtrlOptimizer(learning_rate=...))
Alternatively, you can try a different optimizer altogether:
estimator = tflearn.LinearRegressor(
    feature_columns=fts,
    optimizer=tf.train.GradientDescentOptimizer(learning_rate=...))
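Putting the two fixes together, an end-to-end sketch might look like this (learning_rate=0.01 is only an illustrative value, not a tuned recommendation):
# Full-batch input repeated indefinitely + an explicit optimizer.
input_fn_train = tflearn.io.numpy_input_fn(
    {"x": train_x}, train_y, batch_size=6, num_epochs=None)
estimator = tflearn.LinearRegressor(
    feature_columns=fts,
    optimizer=tf.train.GradientDescentOptimizer(learning_rate=0.01))  # illustrative rate
estimator.fit(input_fn=input_fn_train, steps=1000,
              monitors=[validation_monitor])
print(estimator.evaluate(input_fn=input_fn_test))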
Upvotes: 3