Reputation: 30677
When I print out the predictions, the output includes 3 separate classes (0, 1, and 2), but I only gave it 2 separate classes in the training set (0 and 1). I'm not sure why this is happening. I'm trying to elaborate on a tutorial from the TensorFlow Machine Learning Cookbook; this is based on the last example of Chapter 2, if anyone has access to it. Note: there are some errors, but those may be due to incompatibilities with the older TensorFlow version used in the text.
Anyway, I am trying to develop a very rigid structure when building my models so I can get it ingrained in muscle memory. I am instantiating the tf.Graph beforehand for each tf.Session of a set of computations, and I am also setting the number of threads to use. Note that I am using TensorFlow 1.0.1 with Python 3.6.1, so the f"formatstring{var}" syntax won't work if you have an older version of Python.
Where I am getting confused is the last step of the prediction, under the # Accuracy Predictions section. Why am I getting 3 classes for my classification, and why is my accuracy so poor for such a simple task? I am fairly new to this type of model-based machine learning, so I'm sure it's some syntax error or assumption I have made. Is there an error in my code?
import numpy as np
import pandas as pd
import tensorflow as tf
import matplotlib.pyplot as plt
import multiprocessing
# Set number of CPU to use
tf_max_threads = tf.ConfigProto(intra_op_parallelism_threads=multiprocessing.cpu_count())
# Data
seed = 0
size = 50
x = np.concatenate((np.random.RandomState(seed).normal(-1, 1, size),
                    np.random.RandomState(seed).normal(2, 1, size)))
y = np.concatenate((np.repeat(0, size),
                    np.repeat(1, size)))
# Containers
loss_data = list()
A_data = list()
# Graph
G_6 = tf.Graph()
n = 25  # batch size for each training step
# Iterations
n_iter = 5000
# Train / Test Set
tr_ratio = 0.8
tr_idx = np.random.RandomState(seed).choice(x.size, round(tr_ratio*x.size), replace=False)
te_idx = np.array(list(set(range(x.size)) - set(tr_idx)))
# Build Graph
with G_6.as_default():
    # Placeholders
    pH_x = tf.placeholder(tf.float32, shape=[None, 1], name="pH_x")
    pH_y_hat = tf.placeholder(tf.float32, shape=[None, 1], name="pH_y_hat")
    # Train Set
    x_train = x[tr_idx].reshape(-1, 1)
    y_train = y[tr_idx].reshape(-1, 1)
    # Test Set
    x_test = x[te_idx].reshape(-1, 1)
    y_test = y[te_idx].reshape(-1, 1)
    # Model
    A = tf.Variable(tf.random_normal(mean=10, stddev=1, shape=[1], seed=seed), name="A")
    model = tf.multiply(pH_x, A)
    # Loss
    loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=model, labels=pH_y_hat))
    with tf.Session(graph=G_6, config=tf_max_threads) as sess:
        sess.run(tf.global_variables_initializer())
        # Optimizer
        op = tf.train.GradientDescentOptimizer(0.03)
        train_step = op.minimize(loss)
        # Train linear model
        for i in range(n_iter):
            idx_random = np.random.RandomState(i).choice(x_train.size, size=n)
            x_tr = x[idx_random].reshape(-1, 1)
            y_tr = y[idx_random].reshape(-1, 1)
            sess.run(train_step, feed_dict={pH_x: x_tr, pH_y_hat: y_tr})
            # Current iteration values
            A_iter = sess.run(A)[0]
            loss_iter = sess.run(loss, feed_dict={pH_x: x_tr, pH_y_hat: y_tr}).mean()
            # Append
            loss_data.append(loss_iter)
            A_data.append(A_iter)
            # Log
            if (i + 1) % 1000 == 0:
                print(f"Step #{i + 1}:\tA = {A_iter}", f"Loss = {loss_iter:.3g}", sep="\t")
        print()
        # Accuracy Predictions
        A_result = sess.run(A)
        y_ = tf.squeeze(tf.round(tf.nn.sigmoid_cross_entropy_with_logits(logits=model, labels=pH_y_hat)))
        correct_predictions = tf.equal(y_, pH_y_hat)
        accuracy = tf.reduce_mean(tf.cast(correct_predictions, tf.float32))
        print(sess.run(y_, feed_dict={pH_x: x_train, pH_y_hat: y_train}))
        print("Training:",
              f"Accuracy = {sess.run(accuracy, feed_dict={pH_x: x_train, pH_y_hat: y_train})}",
              f"Shape = {x_train.shape}", sep="\t")
        print("Testing:",
              f"Accuracy = {sess.run(accuracy, feed_dict={pH_x: x_test, pH_y_hat: y_test})}",
              f"Shape = {x_test.shape}", sep="\t")
# Plot path
with plt.style.context("seaborn-whitegrid"):
    fig, ax = plt.subplots(nrows=3, figsize=(6, 6))
    pd.Series(loss_data).plot(ax=ax[0], label="loss", legend=True)
    pd.Series(A_data).plot(ax=ax[1], color="red", label="A", legend=True)
    ax[2].hist(x[:size], np.linspace(-5, 5), label="class_0", color="red")
    ax[2].hist(x[size:], np.linspace(-5, 5), label="class_1", color="blue")
    alphas = np.linspace(0, 0.5, len(A_data))
    for i in range(0, len(A_data), 100):
        alpha = alphas[i]
        a = A_data[i]
        ax[2].axvline(a, alpha=alpha, linestyle="--", color="black")
    ax[2].legend(loc="upper right")
    fig.suptitle("training-process", fontsize=15, y=0.95)
Output Results:
Step #1000: A = 6.72 Loss = 1.13
Step #2000: A = 3.93 Loss = 0.58
Step #3000: A = 2.12 Loss = 0.319
Step #4000: A = 1.63 Loss = 0.331
Step #5000: A = 1.58 Loss = 0.222
[ 0. 0. 1. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 2.
0. 0. 2. 0. 2. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 2. 0. 0. 0. 0. 0. 0. 0. 1. 0.
1. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1.
0. 0. 0. 0. 0. 0. 0. 0.]
Training: Accuracy = 0.475 Shape = (80, 1)
Testing: Accuracy = 0.5 Shape = (20, 1)
Upvotes: 0
Views: 1169
Reputation: 3325
You have a linear regression model, i.e., your output variable (model = tf.multiply(pH_x, A)) yields, for each input, a single scalar value with an arbitrary range. That's generally what you'd have for a regression model, one that needs to predict some numeric value, not for a classifier.
Afterwards, you treat it as if it contained a typical n-ary classifier output (e.g., by passing it to sigmoid_cross_entropy_with_logits), but it does not match the expectations of that function: in that case, the model output should contain multiple values per input data point (e.g., 2 in your case), each corresponding to some metric related to the probability of that class, which are then often passed to a softmax function to normalize them. That is also why a class '2' shows up in your predictions: your y_ rounds the per-example cross-entropy loss values, which are non-negative and unbounded, so rounding them can yield 0, 1, 2, or any larger integer; they are not class labels at all.
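For illustration, here is a minimal sketch of a proper two-class softmax head in the same TensorFlow 1.x style; the names W, b, and pH_y are hypothetical, not from your code:

# Hypothetical sketch of a 2-class softmax classifier head (TF 1.x).
pH_x = tf.placeholder(tf.float32, shape=[None, 1], name="pH_x")
pH_y = tf.placeholder(tf.int32, shape=[None], name="pH_y")         # integer class ids: 0 or 1
W = tf.Variable(tf.random_normal(shape=[1, 2], seed=0), name="W")  # one logit per class
b = tf.Variable(tf.zeros(shape=[2]), name="b")
logits = tf.matmul(pH_x, W) + b                                    # shape [None, 2]
loss = tf.reduce_mean(
    tf.nn.sparse_softmax_cross_entropy_with_logits(logits=logits, labels=pH_y))
y_pred = tf.argmax(logits, axis=1)                                 # predicted class id, 0 or 1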
Alternatively, you may want a binary classifier model that outputs a single value, 0 or 1, depending on the class. In that case, you want something like the logistic (sigmoid) function after the multiplication, so the output can be read as a probability and rounded to a class. Note that once the model itself outputs probabilities rather than raw logits, sigmoid_cross_entropy_with_logits no longer applies, and you would need a different loss function, something like a simple mean squared difference.
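A minimal sketch of that binary option, reusing the names from the question (one possible fix under these assumptions, not the only one):

# Hypothetical sketch of the binary variant (TF 1.x), reusing the question's names.
prob = tf.sigmoid(tf.multiply(pH_x, A))            # squashed into (0, 1): P(class == 1)
loss = tf.reduce_mean(tf.square(prob - pH_y_hat))  # mean squared error loss
y_ = tf.squeeze(tf.round(prob))                    # predictions are now only 0 or 1
correct_predictions = tf.equal(y_, tf.squeeze(pH_y_hat))
accuracy = tf.reduce_mean(tf.cast(correct_predictions, tf.float32))

Alternatively, you could keep sigmoid_cross_entropy_with_logits as the training loss on the raw tf.multiply(pH_x, A) output and apply tf.round(tf.sigmoid(...)) only when making predictions; the key point is that predictions must come from the squashed model output, not from the loss values.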
Currently, the model as written seems like a mash-up of two different, incompatible tutorials.
Upvotes: 2