mstfa23

Reputation: 11

Making batches in TensorFlow

I'm having trouble creating batches in my code. I searched for how batching is done, but all I found were methods like next_batch in the MNIST sample program. I would really appreciate some tips on how to create batches in the program below.

import tensorflow as tf
import numpy as np
from sklearn.model_selection import train_test_split
import pandas as pd
np.random.seed(20160612)
tf.set_random_seed(20160612)

# input data: data is 86594x7 (7 features) and label is 86594x5 (5 one-hot classes)
data2 = pd.read_csv('rawdata.csv', sep=',', header=None) 
data = np.array(data2)
label2=pd.read_csv('class.csv', sep='\t', header=None)
label=np.array(label2)

train_x, test_x, train_t, test_t = train_test_split(data, label, test_size=0.1, random_state=None)

# number of units in the hidden layer
num_units = 15

x = tf.placeholder(tf.float32, [None, 7])
t = tf.placeholder(tf.float32, [None, 5])

w1 = tf.Variable(tf.truncated_normal([7, num_units], mean=0.0, stddev=0.05))
b1 = tf.Variable(tf.zeros([num_units]))
hidden1 = tf.nn.relu(tf.matmul(x, w1) + b1)

w0 = tf.Variable(tf.zeros([num_units, 5]))
b0 = tf.Variable(tf.zeros([5]))

p = tf.nn.softmax(tf.matmul(hidden1, w0) + b0)


loss =  -tf.reduce_sum(t * tf.log(tf.clip_by_value(p,1e-10,1.0)))
train_step = tf.train.AdamOptimizer().minimize(loss)
correct_prediction = tf.equal(tf.argmax(p, 1), tf.argmax(t, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))


sess = tf.InteractiveSession()
sess.run(tf.global_variables_initializer())

# this is how I think batching works

batch_size = 100
for j in range(0, 86594, batch_size):
    xs,ys= train_x[j:j+batch_size],  train_t[j:j+batch_size]


i = 0

for _ in range(4000):
    i += 1

    sess.run(train_step, feed_dict={x: xs, t: ys})
    if i % 100 == 0:
        loss_val, acc_val = sess.run([loss, accuracy],feed_dict={x:test_x, t: test_t})
        print ('Step: %d, Loss: %f, Accuracy: %f'% (i, loss_val, acc_val))

The result of this program, of course, isn't right.

Upvotes: 1

Views: 5066

Answers (1)

Prasad

Reputation: 6034

Keep extracting batches from your data and feeding them to the network for training. In each epoch, every sample of your training dataset should be run through the network once. So you can rewrite your code like this:

Only the required part of the code:

epochs = 4000
batch_size = 100
num_samples = len(train_x)  # the training split holds only ~90% of the 86594 rows

for epoch_no in range(epochs):
    for index, offset in enumerate(range(0, num_samples, batch_size)):
        # slice out the next mini-batch of inputs and matching labels
        xs, ys = train_x[offset: offset + batch_size], train_t[offset: offset + batch_size]
        sess.run(train_step, feed_dict={x: xs, t: ys})

        if index % 100 == 0:
            loss_val, acc_val = sess.run([loss, accuracy], feed_dict={x: test_x, t: test_t})
            print('Epoch: %d, Step: %d, Loss: %f, Accuracy: %f' % (epoch_no, index, loss_val, acc_val))
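
One thing worth adding: if the rows of your CSV are ordered (e.g. by class), the network will see the same batch composition every epoch. Here is a minimal sketch of shuffling at the start of each epoch, continuing from the snippet above; the permutation-based shuffle is my own suggestion, not something the original code does:

for epoch_no in range(epochs):
    # reshuffle inputs and labels with the same permutation so pairs stay aligned
    perm = np.random.permutation(len(train_x))
    train_x, train_t = train_x[perm], train_t[perm]
    for offset in range(0, len(train_x), batch_size):
        xs, ys = train_x[offset: offset + batch_size], train_t[offset: offset + batch_size]
        sess.run(train_step, feed_dict={x: xs, t: ys})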

Upvotes: 1
