anonymous_instrument_

Reputation: 11

Understanding how to train my TensorFlow CNN, line-by-line

Could someone help me understand what each line in the code below is doing? I am new to TensorFlow and very confused.

for epoch in range(training_epochs):
    avg_cost = 0
    total_batch = int(mnist.train.num_examples / batch_size)

    for i in range(total_batch):
        batch_xs, batch_ys = mnist.train.next_batch(batch_size)
        c, _ = m1.train(batch_xs, batch_ys)
        avg_cost += c / total_batch

    print('Epoch:', '%04d' % (epoch + 1), 'cost =', '{:.9f}'.format(avg_cost))

I have defined the number of epochs. The outer for-loop trains the model for the number of epochs specified by training_epochs.

  1. I don't understand what batch_xs and batch_ys are, or why mnist.train.next_batch(batch_size) returns two values, which are assigned to batch_xs and batch_ys.
  2. c, _ = m1.train(batch_xs, batch_ys). c is the cost, but what is the underscore?
  3. Why is avg_cost incremented by c / total_batch rather than by c on each iteration?

Please help me understand.

Upvotes: 0

Views: 85

Answers (2)

fractals

Reputation: 856

I'm assuming you're using hunkim's tutorial.

  1. mnist.train.next_batch(batch_size) returns the next batch of training data. This is probably an MNIST CNN/DNN, which means the network needs both an input and a label for every example: the image of the handwritten digit (input) and the digit it should be classified as (label). These are batch_xs and batch_ys, respectively.

  2. m1.train returns two objects, and the first is the cost of the step. _ is a variable name conventionally used for something that you won't use.

  3. avg_cost is printed only after an entire epoch has run, which suggests that it is the average cost over the epoch. c is the average cost of a single step, and there are total_batch steps per epoch. To get the average cost of one epoch, you add up the c values returned by every step and divide the sum by total_batch. Adding c / total_batch at each step gives the same result, and that is what avg_cost += c / total_batch does.

Upvotes: 3

CIsForCookies

Reputation: 12837

  1. For your first question, you should probably include a bit more code. Without seeing the rest of it, my guess is that batch_xs and batch_ys are the data and the labels.
  2. _ usually marks an unwanted variable. The function returns two objects but you only want the first, so you give the one you want a meaningful name (c) and the other a throwaway name (_).
  3. c / total_batch is exactly what you want. It is each step's proportional share of the epoch's average cost, so after total_batch steps avg_cost holds the mean of all the step costs.
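
If point 3 is still unclear, a small arithmetic check may help. The step costs below are made-up numbers, not output from any real model. The sketch only shows that accumulating c / total_batch inside the loop ends up at the same value as summing all the costs and dividing once at the end.

```python
import math

# Made-up per-step costs for one epoch (an assumption for illustration).
step_costs = [0.9, 0.6, 0.3]
total_batch = len(step_costs)

# Running form, as in the question's loop.
avg_cost = 0
for c in step_costs:
    avg_cost += c / total_batch

# Direct form: sum everything, then divide once.
mean_cost = sum(step_costs) / total_batch

# The two agree up to floating-point rounding.
print(avg_cost, mean_cost, math.isclose(avg_cost, mean_cost))
```

If the loop added c instead of c / total_batch, avg_cost would be the total cost of the epoch, which grows with the number of batches and is harder to compare across different batch sizes.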

Upvotes: 1
