Reputation: 35136

How do you train a linear model in tensorflow?

I have generated some input data in a CSV where 'awesomeness' is 'age * 10'. It looks like this:

age, awesomeness
67, 670
38, 380
32, 320
69, 690
40, 400

It should be trivial to write a tensorflow model that can predict 'awesomeness' from 'age', but I can't make it work.

When I run training, the output I get is:

accuracy: 0.0 <----------------------------------- What!??
accuracy/baseline_label_mean: 443.8
accuracy/threshold_0.500000_mean: 0.0
auc: 0.0
global_step: 6000
labels/actual_label_mean: 443.8
labels/prediction_mean: 1.0
loss: -2.88475e+09
precision/positive_threshold_0.500000_mean: 1.0
recall/positive_threshold_0.500000_mean: 1.0

Note that this is a deliberately contrived example: I was getting the same result (0% accuracy) with a more complex, meaningful model and a much larger data set.

This is my attempt at the most minimal possible reproducible test case that I can make which exhibits the same behaviour.

Here's what I'm doing, based on the census example for the DNNClassifier from tflearn:

from os import path

import pandas as pd
import tensorflow as tf

COLUMNS = ["age", "awesomeness"]
CONTINUOUS_COLUMNS = ["age"]
OUTPUT_COLUMN = "awesomeness"

def build_estimator(model_dir):
  """Build an estimator."""
  age = tf.contrib.layers.real_valued_column("age")
  deep_columns = [age]

  m = tf.contrib.learn.DNNClassifier(model_dir=model_dir,
                                     feature_columns=deep_columns,
                                     hidden_units=[50, 10])
  return m

def input_fn(df):
  """Input builder function."""
  feature_cols = {k: tf.constant(df[k].values, shape=[df[k].size, 1]) for k in CONTINUOUS_COLUMNS}
  output = tf.constant(df[OUTPUT_COLUMN].values, shape=[df[OUTPUT_COLUMN].size, 1])
  return feature_cols, output

def train_and_eval(model_dir, train_steps):
  """Train and evaluate the model."""
  train_file_name, test_file_name = training_data()
  df_train = pd.read_csv(...)  # omitted for clarity
  df_test = pd.read_csv(...)

  m = build_estimator(model_dir)
  m.fit(input_fn=lambda: input_fn(df_train), steps=train_steps)

  results = m.evaluate(input_fn=lambda: input_fn(df_test), steps=1)
  for key in sorted(results):
    print("%s: %s" % (key, results[key]))

def training_data():
  """Return path to the training and test data"""
  training_datafile = path.join(path.dirname(__file__), 'data', 'data.training')
  test_datafile = path.join(path.dirname(__file__), 'data', 'data.test')
  return training_datafile, test_datafile

model_folder = 'scripts/model'  # Where to store the model
train_steps = 2000  # How many iterations to run while training
train_and_eval(model_folder, train_steps)

A couple of notes:

- The original example tutorial this is based on is here: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/learn/wide_n_deep_tutorial.py
- Notice I am using the DNNClassifier, not the LinearClassifier, because I specifically want to deal with continuous input variables.
- A lot of examples just use 'premade' data sets which are known to work with the examples; my data set has been manually generated and is absolutely not random.
- I have verified that the CSV loader is loading the data correctly as int64 values.
- Training and test data are generated identically, but have different values in them. However, using data.training as the test data still returns 0% accuracy, so something is clearly not working; this isn't just over-fitting.

Upvotes: 0

Answers (3)

jaqen0

Reputation: 52

There are a few things I would like to say. Assuming you load the data correctly:

- This looks like a regression task, and you are using a classifier. I'm not saying it cannot work at all, but used this way you are assigning a separate label to each age entry, and training on the whole batch each epoch is very unstable.

- You are getting a huge value for the loss, which suggests your gradients are exploding. With this toy dataset you probably need to tune hyperparameters such as the number of hidden neurons, the learning rate and the number of epochs. Try logging the loss value for each epoch to see whether that is the problem.

- Last suggestion: get your data working with a simpler model suited to your task, such as a regression model, and then scale up.
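
The advice to log the loss per epoch can be tried without TensorFlow. The following is a minimal sketch in plain Python, assuming a one-parameter model y = slope * x on the five rows from the question; the function name train_slope and both learning rates are illustrative choices, not taken from the question's code. With unscaled ages, full-batch gradient descent converges for the smaller step size and diverges for the larger one, and the logged losses make that visible straight away.

```python
# Five rows from the question: awesomeness = 10 * age.
ages = [67.0, 38.0, 32.0, 69.0, 40.0]
labels = [10.0 * a for a in ages]


def train_slope(learning_rate, epochs):
    """Fit y = slope * x by full-batch gradient descent.

    Returns the final slope and the mean squared error logged
    at the start of every epoch.
    """
    slope = 0.0
    n = len(ages)
    losses = []
    for _ in range(epochs):
        errors = [slope * x - y for x, y in zip(ages, labels)]
        losses.append(sum(e * e for e in errors) / n)
        # Gradient of the mean squared error with respect to slope.
        grad = 2.0 * sum(e * x for e, x in zip(errors, ages)) / n
        slope -= learning_rate * grad
    return slope, losses


# Illustrative step sizes: one small enough to converge, one too large.
stable_slope, stable_losses = train_slope(learning_rate=1e-4, epochs=50)
unstable_slope, unstable_losses = train_slope(learning_rate=1e-3, epochs=10)

print("stable run:   slope =", stable_slope, "last loss =", stable_losses[-1])
print("unstable run: slope =", unstable_slope, "last loss =", unstable_losses[-1])
```

If the logged loss grows from epoch to epoch, as in the second run, lowering the learning rate or rescaling the inputs is a reasonable first thing to try.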

Upvotes: 1

Doug

Reputation: 35136

See also https://github.com/tflearn/tflearn/blob/master/examples/basics/multiple_regression.py for using tflearn to solve this.

""" Multiple Regression/Multi target Regression Example

The input features have 10 dimensions, and target features are 2 dimension.

"""

from __future__ import absolute_import, division, print_function

import tflearn
import numpy as np

# Regression data- 10 training instances
#10 input features per instance.
X=np.random.rand(10,10).tolist()
#2 output features per instance
Y=np.random.rand(10,2).tolist()

# Multiple Regression graph, 10-d input layer
input_ = tflearn.input_data(shape=[None,10])
#10-d fully connected layer
r1 = tflearn.fully_connected(input_,10)
#2-d fully connected layer for output
r1 = tflearn.fully_connected(r1,2)
r1 = tflearn.regression(r1, optimizer='sgd', loss='mean_square',
                                        metric='R2', learning_rate=0.01)

m = tflearn.DNN(r1)
m.fit(X,Y, n_epoch=100, show_metric=True, snapshot_epoch=False)

#Predict for 1 instance
testinstance=np.random.rand(1,10).tolist()
print("\nInput features:  ",testinstance)
print("\n Predicted output: ")
print(m.predict(testinstance))

Upvotes: 0

dseuss

Reputation: 1141

First of all, you are describing a regression task, not a classification task. Therefore, both DNNClassifier and LinearClassifier would be the wrong thing to use. That also makes accuracy the wrong quantity to use to tell whether your model works or not. I suggest you read up on these two different contexts, e.g. in the book "The Elements of Statistical Learning".
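
To illustrate why accuracy says little for a continuous target, here is a minimal sketch in plain Python. The labels are the five awesomeness values from the question; the predictions are made-up numbers chosen only for illustration. Every prediction is close to its label, yet none matches exactly, so an exact-match accuracy reports total failure while the mean squared error reports a small number.

```python
# Labels from the question's data; predictions are hypothetical values
# that are close to, but never exactly equal to, the labels.
labels = [670.0, 380.0, 320.0, 690.0, 400.0]
predictions = [669.6, 380.3, 320.2, 689.5, 400.4]

n = len(labels)

# Classification-style metric: fraction of predictions equal to the label.
exact_match_accuracy = sum(p == y for p, y in zip(predictions, labels)) / n

# Regression-style metric: average squared distance to the label.
mean_squared_error = sum((p - y) ** 2 for p, y in zip(predictions, labels)) / n

print("exact-match accuracy:", exact_match_accuracy)
print("mean squared error:  ", mean_squared_error)
```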

But here is a short answer to your problem. Say you have a linear model

awesomeness_predicted = slope * age

where slope is the parameter you want to learn from data. Let's say you have data age[0], ..., age[N-1] and the corresponding awesomeness values a_data[0], ..., a_data[N-1]. To measure how well your model works, we are going to use the mean squared error, that is

error = sum((a_data[i] - a_predicted[i])**2 for i in range(N)) / N

What you want to do now is start with a random guess for slope and gradually improve it using gradient descent. Here is a full working example in pure TensorFlow (written against the 1.x API):

import tensorflow as tf
import numpy as np

DTYPE = tf.float32

## Generate Data
age = np.array([67, 38, 32, 69, 40])
awesomeness = 10 * age

## Generate model
# define the parameter of the model
slope = tf.Variable(initial_value=tf.random_normal(shape=(1,), dtype=DTYPE))
# define the data inputs to the model as variable size tensors
x = tf.placeholder(DTYPE, shape=(None,))
y_data = tf.placeholder(DTYPE, shape=(None,))
# specify the model
y_pred = slope * x
# use mean squared error as loss function
loss = tf.reduce_mean(tf.square(y_data - y_pred))
target = tf.train.AdamOptimizer().minimize(loss)

## Train Model
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    for epoch in range(100000):
        _, training_loss = sess.run([target, loss],
                                    feed_dict={x: age, y_data: awesomeness})
    print("Training loss: ", training_loss)
    print("Found slope=", sess.run(slope))
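
As a cross-check on the slope the training loop above should approach, a least-squares fit of a model without an intercept has the closed form slope = sum(x * y) / sum(x * x). A minimal sketch in plain Python on the same five data points (the variable names are illustrative):

```python
# Same data as in the training example: awesomeness = 10 * age.
ages = [67, 38, 32, 69, 40]
awesomeness = [10 * a for a in ages]

# Closed-form least-squares slope for the no-intercept model y = slope * x.
numerator = sum(x * y for x, y in zip(ages, awesomeness))
denominator = sum(x * x for x in ages)
closed_form_slope = numerator / denominator

print("Closed-form slope:", closed_form_slope)
```

If the trained slope differs noticeably from this value, the training run most likely needs more steps or a different learning rate.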

Upvotes: 1
