Engine

Reputation: 5420

Tensorflow real example

I've written some code based on this TensorFlow example. The issue I'm having is that the accuracy I get doesn't make any sense (it's either 1 or 0). What am I missing here?

import tensorflow as tf
import  numpy as np
import  csv
import os


# defining batch function

def batch(iterable, n=1):
    l = len(iterable)
    for ndx in range(0, l, n):
        yield iterable[ndx:min(ndx + n, l)]


os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
Training_File = 'Training.csv'
Test_File     = 'Test.csv'
numberOFClasses = 19
batchSize = 19

# read training data
filePointer  = open(Training_File, 'r', newline='')
reader = csv.reader(filePointer)
Training_Data   = []
Training_Labels = []
row = next(reader)
len(row)
#### Getting Training_Data and labels
for  row in reader:
    Training_Data.append(row[:-2])
    Training_Labels.append(row[-1])
# close TrainingFile  and  getting Data and labels from Test
len(Training_Data)
filePointer.close();

filePointer =open(Test_File, 'r', newline='')
reader  =   csv.reader(filePointer)
Test_Data = []
Test_Labels=[]
row = next(reader)

for row in reader:
    Test_Data.append(row[:-2])
    Test_Labels.append(row[-1])
len(Test_Labels)
filePointer.close()
len(Training_Data[0])



x = tf.placeholder('float',[None,len(row[:-2])])
w = tf.Variable(tf.zeros([len(row[:-2]),numberOFClasses]))
b = tf.Variable(tf.zeros([numberOFClasses]))
model = tf.add(tf.matmul(x,w),b)
y_ = tf.placeholder(tf.float32,[None,numberOFClasses])
y =  tf.nn.softmax(model)

cross_entropy= -tf.reduce_sum(y_*tf.log(y),reduction_indices=[1])
train_step = tf.train.GradientDescentOptimizer(0.01).minimize(cross_entropy)
sess = tf.Session()
sess.run(tf.global_variables_initializer())
index =0
batch_xs = []
batch_ys = []
batch_txs= []
batch_tys= []
# Training processing


for i in batch(Training_Data,batchSize):
    batch_xs.append(i)
for i in batch(Training_Labels,batchSize):
    batch_ys.append(i)

for i in batch(Test_Data,batchSize):
    batch_txs.append(i)
for i in batch(Test_Labels,batchSize):
    batch_tys.append(i)


#print(np.reshape(batch_ys[len],(1,batchSize)))
for i in range(len(batch_xs) -1 ):
    sess.run(train_step,feed_dict={x:batch_xs[i],y_:np.reshape(batch_ys[i],(1,batchSize))})



correct_prediction = tf.equal(tf.arg_max(y,1),tf.arg_max(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction,"float"))
for i in range(len(batch_txs) -1):
    print(sess.run(accuracy,feed_dict={x:batch_txs[i],y_:np.reshape(batch_tys[i],(1,batchSize))}))

UPDATE: I've changed the size of the batches:

.............................................
numberOFClasses = 19

batchSize = 19 * 3
....................................
for i in range(int(len(batch_xs)/batchSize) ):
    print(sess.run(train_step,feed_dict={x:batch_xs[i],y_:np.reshape(batch_ys[i],(batchSize,numberOFClasses))}))



correct_prediction = tf.equal(tf.arg_max(y,1),tf.arg_max(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction,"float"))
for i in range(len(batch_txs) -1):
    print(sess.run(accuracy,feed_dict={x:batch_txs[i],y_:np.reshape(batch_tys[i],(1,batchSize))}))

The result is still the same, so I just don't get what I'm missing here.

2nd Update

Running this part of the code:

for j in range(len(batch_xs)-1):
    print(sess.run(train_step,feed_dict={x:batch_xs[j],y_:np.reshape(batch_ys[j],(numberOFClasses,3))}))

produces a huge error message, but I guess this is the relevant part:

InvalidArgumentError (see above for traceback): Incompatible shapes: [19,3] vs. [57,19]
 [[Node: mul = Mul[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"](_recv_Placeholder_1_0, Log)]]

So since my batch size is three times the number of classes, I should get 57 predictions -> Y_.

Shaping the feed for Y_ as [57,1]:

for j in range(len(batch_xs)-1):
    print(sess.run(train_step,feed_dict={x:batch_xs[j],y_:np.reshape(batch_ys[j],(batchSize,1))}))

The print returns None but raises no error, which is (I guess) OK.

But running the accuracy part still gives 1 and 0, as mentioned at the beginning.

The test and training data and labels are 100% correct!

Here is part of the end of the CSV file:

[image: end of the CSV file]

Upvotes: 0

Views: 535

Answers (2)

gdelab

Reputation: 6220

This can't be the source of your problem or errors, but I think every occurrence of row[:-2] should be replaced by row[:-1] if you want to take all columns but the last one (Python excludes the end index given in a slice like row[begin:end]).
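To illustrate the slicing point, here is a small sketch with a made-up row. The values and the column layout (three features followed by one label) are assumptions for illustration only, not taken from your CSV:

```python
# Hypothetical CSV row: three feature columns followed by one label column.
row = ['0.1', '0.2', '0.3', '7']

features_minus_two = row[:-2]   # drops the last TWO entries
features_minus_one = row[:-1]   # drops only the last entry (the label)
label = row[-1]                 # the last entry on its own

print(features_minus_two)  # ['0.1', '0.2']
print(features_minus_one)  # ['0.1', '0.2', '0.3']
print(label)               # 7
```

If the second-to-last column in your file really is something you want to exclude, row[:-2] is fine; otherwise it silently throws away one feature.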

You should have:

y_ = tf.placeholder(tf.float32,[None,numberOFClasses])
...
sess.run(train_step,feed_dict={x:batch_xs[i],y_:np.reshape(batch_ys[i],(batchSize, numberOFClasses ))})
...
print(sess.run(accuracy,feed_dict={x:batch_txs[i],y_:np.reshape(batch_tys[i],(batchSize, numberOFClasses ))}))

Anyway, you should definitely use batchSize != numberOFClasses, because then the code throws an error that you can use to understand what is wrong. If you don't, you lose the exception message but the error is still there, hidden (your network still does not learn what you want). When you get the error, look at which reshape causes the problem and try to understand why (check what the shapes are and what they should be).
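As a hint for that shape check, here is a minimal NumPy sketch of how one label per sample can be turned into the (batchSize, numberOFClasses) array that the y_ placeholder expects. It assumes the labels are integer class indices from 0 to numberOFClasses - 1, which your question does not show, so treat the label values below as hypothetical:

```python
import numpy as np

number_of_classes = 19   # same value as numberOFClasses in the question
batch_size = 57          # 19 * 3, as in the question's update

# Hypothetical labels: one integer class index per sample (an assumption;
# csv.reader yields strings, so real labels would need int(...) first).
labels = np.arange(batch_size) % number_of_classes

# One-hot encode: row i has a 1.0 in column labels[i] and 0.0 elsewhere.
one_hot = np.eye(number_of_classes, dtype=np.float32)[labels]

print(labels.shape)    # (57,)
print(one_hot.shape)   # (57, 19)
```

A plain np.reshape cannot do this, because it only rearranges the 57 existing values and never creates the 57 * 19 entries a one-hot array needs.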

Upvotes: 1

etarion

Reputation: 17131

From your code sample it's impossible to tell exactly (at least batch() and batchSize would be needed to be sure), but my guess is that you have batches of size one (whether intended or not), so you get either an accuracy of one (if the sample was predicted correctly) or zero (if the sample was misclassified). For meaningful accuracies, you want to evaluate over larger batches.
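A small plain-Python sketch of that effect, with made-up predictions (the accuracy helper and the class indices are hypothetical and only mirror what tf.equal followed by tf.reduce_mean computes):

```python
def accuracy(predicted, actual):
    """Fraction of positions where the predicted class equals the true class."""
    matches = [p == a for p, a in zip(predicted, actual)]
    return sum(matches) / len(matches)

# Hypothetical predicted and true class indices for five test samples.
predicted = [3, 7, 7, 1, 0]
actual = [3, 7, 2, 1, 5]

# One sample at a time: each value can only be 1.0 or 0.0.
per_sample = [accuracy([p], [a]) for p, a in zip(predicted, actual)]
print(per_sample)                    # [1.0, 1.0, 0.0, 1.0, 0.0]

# All samples at once: a fraction between 0 and 1.
print(accuracy(predicted, actual))   # 0.6
```

Averaging over a single comparison can only ever give 0 or 1, which matches the output described in the question.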

Upvotes: 0
