Afshin Oroojlooy

Reputation: 1434

Accuracy issue with multiple outputs in TensorFlow

I have a system that can have 65 different properties at any given moment, and I want to predict them with a DNN. My inputs are the properties of the system (79 binary inputs) and the output is 65 discrete statuses, each of which can be 0, 1, or 2. So for each input I may have an output vector like [0,0,1,2,...,2,1,0,1,1]. To feed this to a DNN, I wanted to have 65 softmax outputs, each with three units, so the output of the DNN is a vector y of shape [None, 65*3].
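
For concreteness, each target row is built by one-hot encoding every status within its own group of three columns (a minimal NumPy sketch; statuses is a made-up example vector):

import numpy as np

# hypothetical status vector for one sample: 65 entries, each 0, 1, or 2
statuses = np.random.randint(0, 3, size=65)

# grouped one-hot encoding: 65 groups of 3 columns -> one 195-dim target row
target = np.zeros(65 * 3)
for i, s in enumerate(statuses):
    target[i * 3 + s] = 1.0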

I implemented this as a fully connected network in TensorFlow. However, I have a problem computing the accuracy of each prediction. For each sample, given the true labels y_, the accuracy should be obtainable with:

correct_predct = tf.reduce_sum(tf.cast([tf.equal(tf.argmax(y_[:, i*3:(i+1)*3], 0), tf.argmax(y[:, i*3:(i+1)*3], 0)) for i in range(65)], tf.float32))

accuracy = tf.reduce_mean(tf.scalar_mul(1/65.0, correct_predct))

However, it does not work because of the way y_ and y are defined.
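
For reference, here is the per-status accuracy I am trying to compute, written in plain NumPy on hypothetical stand-in arrays (reshaping to (-1, 65, 3) and taking the argmax over the last axis is what I am trying to express in TensorFlow):

import numpy as np

# hypothetical data: 10 samples, 65 statuses, 3 classes each
statuses = np.random.randint(0, 3, size=(10, 65))   # true status per property
y_true = np.eye(3)[statuses].reshape(10, 195)       # grouped one-hot targets
y_pred = np.random.rand(10, 195)                    # stand-in network output

# argmax within each group of 3 columns recovers a status index per property
pred_status = y_pred.reshape(-1, 65, 3).argmax(axis=2)

# fraction of the 65 statuses predicted correctly, averaged over samples
accuracy = (pred_status == statuses).mean()
print(accuracy)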

Here is my code:

import tensorflow as tf
import numpy as np
import scipy.io as sio
from tensorflow.python.training import queue_runner
tf.logging.set_verbosity(tf.logging.FATAL)

sess = tf.InteractiveSession()

maxiter = 50000
display = 100
decay_rate = 0.9
starter_learning_rate = 0.001
power = 0.75
l2lambda = .01
init_momentum = 0.9
decay_step = 5000

nnodes1 = 350
nnodes2 = 100
batch_size = 50
var = 2.0/(67+195)

print decay_rate,starter_learning_rate,power,l2lambda,init_momentum,decay_step,nnodes1,nnodes2,batch_size

result_mat = sio.loadmat('binarySysFuncProf.mat')
feature_mat = sio.loadmat('binaryDurSamples.mat')

result = result_mat['binarySysFuncProf']
feature = feature_mat['binaryDurSamples']

train_size=750000
test_size=250000
train_feature = feature[0:train_size,:]
train_output = result[0:train_size,:]

test_feature = feature[train_size : train_size + test_size, :]
test_output = result[train_size : train_size + test_size, :]

# placeholders for the input features and the targets
x = tf.placeholder(tf.float64, shape=[None,79])
y_ = tf.placeholder(tf.float64, shape=[None,195])
learning_rate = tf.placeholder(tf.float64, shape=[])

# define the variables
W1 = tf.Variable(np.random.normal(0,var,(79,nnodes1)))
b1 = tf.Variable(np.random.normal(0,var,nnodes1))

W2 = tf.Variable(np.random.normal(0,var,(nnodes1,nnodes2)))
b2 = tf.Variable(np.random.normal(0,var,nnodes2))

W3 = tf.Variable(np.random.normal(0,var,(nnodes2,195)))  # 65 groups * 3 statuses = 195 outputs
b3 = tf.Variable(np.random.normal(0,var,195))

# Passing global_step to minimize() will increment it at each step.
global_step = tf.Variable(0, trainable=False)
momentum = tf.Variable(init_momentum, trainable=False)

# prediction function (two hidden sigmoid layers, softmax output)
layer1 = tf.nn.sigmoid(tf.matmul(x,W1) + b1)
layer2 = tf.nn.sigmoid(tf.matmul(layer1,W2) + b2)
y = tf.nn.softmax(tf.matmul(layer2,W3) + b3)

cost_function = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y)))
#cost_function = tf.sum([tf.reduce_mean((-tf.reduce_sum(y_[:,i*3:(i+1)*3])*tf.log(y[:,i*3:(i+1)*3]))) for i in range(65)])
correct_predct = tf.reduce_sum(tf.cast([tf.equal(tf.argmax(y_[:, i*3:(i+1)*3], 0), tf.argmax(y[:, i*3:(i+1)*3], 0)) for i in range(65)], tf.float32))
accuracy = tf.reduce_mean(tf.scalar_mul(1/65.0, correct_predct))

l2regularization = tf.reduce_sum(tf.square(W1)) + tf.reduce_sum(tf.square(b1)) + tf.reduce_sum(tf.square(W2)) + tf.reduce_sum(tf.square(b2)) + tf.reduce_sum(tf.square(W3)) + tf.reduce_sum(tf.square(b3))

loss = (cost_function) + l2lambda*l2regularization

# define the learning_rate and its decaying procedure.
learning_rate = tf.train.exponential_decay(starter_learning_rate, global_step,decay_step, decay_rate, staircase=True)

# define the optimizer and the training step
#train_step = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss, global_step=global_step)
train_step = tf.train.MomentumOptimizer(learning_rate,0.9).minimize(loss, global_step=global_step)

# initialize the variables
sess.run(tf.initialize_all_variables())

# Train the model for maxiter iterations; sampling random mini-batches makes this SGD
for i in range(maxiter):
  batch = np.random.randint(0,train_size,size=batch_size)
  train_step.run(feed_dict={x:train_feature[batch,:], y_:train_output[batch,:]})  
  if np.mod(i,display) == 0:
    train_loss = cost_function.eval(feed_dict={x: train_feature[0:train_size,:], y_: train_output[0:train_size,:]})
    test_loss = cost_function.eval(feed_dict={x: test_feature, y_: test_output})
    train_acc = 0  # accuracy.eval(feed_dict={x: train_feature[0:train_size,:], y_: train_output[0:train_size,:]})
    test_acc = 0  # accuracy.eval(feed_dict={x: test_feature, y_: test_output})
    print "Iter" , i, "lr" , learning_rate.eval() , "| Train loss" , train_loss , "| Test loss", test_loss  , "| Train Accu", train_acc , "| Test Accu", test_acc  ,"||W||",l2regularization.eval() , "lmbd*||W||", l2lambda*l2regularization.eval()

Since y_ and y are matrices with 195 columns, I get the following error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 555, in eval
    return _eval_using_default_session(self, feed_dict, self.graph, session)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 3498, in _eval_using_default_session
    return session.run(tensors, feed_dict)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 372, in run
    run_metadata_ptr)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 625, in _run
    % (np_val.shape, subfeed_t.name, str(subfeed_t.get_shape())))
ValueError: Cannot feed value of shape (750000, 195) for Tensor u'Placeholder_7:0', which has shape '(?, 3)'

I would appreciate any comments or help on computing the accuracy.

Afshin

Upvotes: 0

Views: 900

Answers (1)

Simon. Li

Reputation: 429

Do you mean that you have 65 classes which are NOT mutually exclusive?

If so, you're doing multi-label classification; see the Wikipedia article: https://en.wikipedia.org/wiki/Multi-label_classification

To train a multi-label classifier, you need to one-hot encode your classes. Say your output tensor has shape (-1, 65), where a value of 1 means the class should be predicted.

Your output layer should use a sigmoid activation, with something like binary cross-entropy as the loss function; see the sketch below.
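
A minimal sketch of that output head, assuming TF 1.x-style APIs, the last hidden layer layer2 of width nnodes2 from the question's network, and y_ redefined as a float placeholder of shape (None, 65) holding 0/1 labels (tf.nn.sigmoid_cross_entropy_with_logits is the raw-TensorFlow counterpart of binary_crossentropy):

# hypothetical multi-label head: 65 independent sigmoid units
W3 = tf.Variable(tf.truncated_normal([nnodes2, 65], stddev=0.1))
b3 = tf.Variable(tf.zeros([65]))
logits = tf.matmul(layer2, W3) + b3
y = tf.nn.sigmoid(logits)  # independent probability per class

# binary cross-entropy computed from the raw logits (numerically stable)
loss = tf.reduce_mean(
    tf.nn.sigmoid_cross_entropy_with_logits(labels=y_, logits=logits))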

Upvotes: 1
