Reputation: 35136
I have generated some input data in a CSV where 'awesomeness' is 'age * 10'. It looks like this:
age, awesomeness
67, 670
38, 380
32, 320
69, 690
40, 400
It should be trivial to write a tensorflow model that can predict 'awesomeness' from 'age', but I can't make it work.
When I run training, the output I get is:
accuracy: 0.0 <----------------------------------- What!??
accuracy/baseline_label_mean: 443.8
accuracy/threshold_0.500000_mean: 0.0
auc: 0.0
global_step: 6000
labels/actual_label_mean: 443.8
labels/prediction_mean: 1.0
loss: -2.88475e+09
precision/positive_threshold_0.500000_mean: 1.0
recall/positive_threshold_0.500000_mean: 1.0
Please note that this is obviously a completely contrived example, but that is because I was getting the same result with a more complex meaningful model with a much larger data set; 0% accuracy.
This is my attempt at the most minimal possible reproducible test case that I can make which exhibits the same behaviour.
Here's what I'm doing, based on the census example for the DNNClassifier from tflearn:
COLUMNS = ["age", "awesomeness"]
CONTINUOUS_COLUMNS = ["age"]
OUTPUT_COLUMN = "awesomeness"
def build_estimator(model_dir):
"""Build an estimator."""
age = tf.contrib.layers.real_valued_column("age")
deep_columns = [age]
m = tf.contrib.learn.DNNClassifier(model_dir=model_dir,
feature_columns=deep_columns,
hidden_units=[50, 10])
return m
def input_fn(df):
"""Input builder function."""
feature_cols = {k: tf.constant(df[k].values, shape=[df[k].size, 1]) for k in CONTINUOUS_COLUMNS}
output = tf.constant(df[OUTPUT_COLUMN].values, shape=[df[OUTPUT_COLUMN].size, 1])
return feature_cols, output
def train_and_eval(model_dir, train_steps):
"""Train and evaluate the model."""
train_file_name, test_file_name = training_data()
df_train = pd.read_csv(...) # ommitted for clarity
df_test = pd.read_csv(...)
m = build_estimator(model_dir)
m.fit(input_fn=lambda: input_fn(df_train), steps=train_steps)
results = m.evaluate(input_fn=lambda: input_fn(df_test), steps=1)
for key in sorted(results):
print("%s: %s" % (key, results[key]))
def training_data():
"""Return path to the training and test data"""
training_datafile = path.join(path.dirname(__file__), 'data', 'data.training')
test_datafile = path.join(path.dirname(__file__), 'data', 'data.test')
return training_datafile, test_datafile
model_folder = 'scripts/model' # Where to store the model
train_steps = 2000 # How many iterations to run while training
train_and_eval(model_folder, train_steps)
A couple of notes:
The original example tutorial this is based on is here https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/learn/wide_n_deep_tutorial.py
Notice I am using the DNNClassifier, not the LinearClassifier as I want specifically to deal with continuous input variables.
A lot of examples just use 'premade' data sets which are known to work with examples; my data set has been manually generated and is absolutely not random.
I have verified the csv loader is loading the data correctly as int64 values.
Training and test data are generated identically, but have different values in them; however, using data.training as the test data still returns a 0% accuracy, so there's no question that something isn't working, this isn't just over-fitting.
Upvotes: 0
Views: 488
Reputation: 52
There are a few things I would like to say. Assuming you load the data correctly:
-This looks like a regression task and you are using a classifier. I'm not saying it doesn't work at all, but like this your are giving a label to each entry of age and training on the whole batch each epoch is very unstable.
-You are getting a huge value for the loss, your gradients are exploding. Having this toy dataset you probably need to tune hyperparameters like hidden neurons, learning rate and number of epochs. Try to log the loss value for each epoch and see if that may be the problem.
-Last suggestion, make your data work with a simpler model, possibly suited for your task, like a regression model and then scale up
Upvotes: 1
Reputation: 35136
See also https://github.com/tflearn/tflearn/blob/master/examples/basics/multiple_regression.py for using tflearn to solve this.
""" Multiple Regression/Multi target Regression Example
The input features have 10 dimensions, and target features are 2 dimension.
"""
from __future__ import absolute_import, division, print_function
import tflearn
import numpy as np
# Regression data- 10 training instances
#10 input features per instance.
X=np.random.rand(10,10).tolist()
#2 output features per instance
Y=np.random.rand(10,2).tolist()
# Multiple Regression graph, 10-d input layer
input_ = tflearn.input_data(shape=[None,10])
#10-d fully connected layer
r1 = tflearn.fully_connected(input_,10)
#2-d fully connected layer for output
r1 = tflearn.fully_connected(r1,2)
r1 = tflearn.regression(r1, optimizer='sgd', loss='mean_square',
metric='R2', learning_rate=0.01)
m = tflearn.DNN(r1)
m.fit(X,Y, n_epoch=100, show_metric=True, snapshot_epoch=False)
#Predict for 1 instance
testinstance=np.random.rand(1,10).tolist()
print("\nInput features: ",testinstance)
print("\n Predicted output: ")
print(m.predict(testinstance))
Upvotes: 0
Reputation: 1141
First of all, you are describing a regression task, not a classification task. Therefore, both, DNNClassifier and LinearClassifier would be the wrong thing to use. That also makes accuracy the wrong quantity to use to tell if your model works or not. I suggest you read up on these two different context e.g. in the book "The Elements of Statistical Learning"
But here is a short answer to your problem. Say you have a linear model
awesomeness_predicted = slope * age
where slope
is the parameter you want to learn from data. Lets say you have data age[0], ..., age[N]
and the corresponding awesomeness values a_data[0],...,a_data[N]
. In order to specify if your model works well, we are going to use mean squared error, that is
error = sum((a_data[i] - a_predicted[i])**2 for i in range(N))
What you want to do now is start with a random guess for slope and gradually improving using gradient descent. Here is a full working example in pure tensorflow
import tensorflow as tf
import numpy as np
DTYPE = tf.float32
## Generate Data
age = np.array([67, 38, 32, 69, 40])
awesomeness = 10 * age
## Generate model
# define the parameter of the model
slope = tf.Variable(initial_value=tf.random_normal(shape=(1,), dtype=DTYPE))
# define the data inputs to the model as variable size tensors
x = tf.placeholder(DTYPE, shape=(None,))
y_data = tf.placeholder(DTYPE, shape=(None,))
# specify the model
y_pred = slope * x
# use mean squared error as loss function
loss = tf.reduce_mean(tf.square(y_data - y_pred))
target = tf.train.AdamOptimizer().minimize(loss)
## Train Model
init = tf.global_variables_initializer()
with tf.Session() as sess:
sess.run(init)
for epoch in range(100000):
_, training_loss = sess.run([target, loss],
feed_dict={x: age, y_data: awesomeness})
print("Training loss: ", training_loss)
print("Found slope=", sess.run(slope))
Upvotes: 1