Reputation: 2997
TensorBoard runs and I can see the plots, but the losses do not appear to update. I can print the batch losses at each step, so I am uncertain why TensorBoard is not reflecting them.
I am attempting to learn about TensorFlow Slim, following this example:
Attempted
Setting trace_every_n_steps to various values (1, 2, 5)
Code
My code is largely from the "Fine-tune the model on a different set of labels" section of the tutorial.
Explicitly:
import os

from preprocessing import inception_preprocessing

import numpy as np
import tensorflow as tf

from datasets import flowers
from nets import inception
from tensorflow.contrib import slim

import matplotlib.pyplot as plt

image_size = inception.inception_v1.default_image_size
checkpoints_dir = './tmp/checkpoints'
flowers_data_dir = './tmp/data/tf_records'

if not tf.gfile.Exists(checkpoints_dir):
    tf.gfile.MakeDirs(checkpoints_dir)

def load_batch(dataset, batch_size=32, height=299, width=299, is_training=False):
    """Loads a single batch of data.

    Args:
      dataset: The dataset to load.
      batch_size: The number of images in the batch.
      height: The size of each image after preprocessing.
      width: The size of each image after preprocessing.
      is_training: Whether or not we're currently training or evaluating.

    Returns:
      images: A Tensor of size [batch_size, height, width, 3], image samples that have been preprocessed.
      images_raw: A Tensor of size [batch_size, height, width, 3], image samples that can be used for visualization.
      labels: A Tensor of size [batch_size], whose values range between 0 and dataset.num_classes.
    """
    data_provider = slim.dataset_data_provider.DatasetDataProvider(
        dataset, common_queue_capacity=32,
        common_queue_min=8)
    image_raw, label = data_provider.get(['image', 'label'])

    # Preprocess image for usage by Inception.
    image = inception_preprocessing.preprocess_image(image_raw, height, width, is_training=is_training)

    # Preprocess the image for display purposes.
    image_raw = tf.expand_dims(image_raw, 0)
    image_raw = tf.image.resize_images(image_raw, [height, width])
    image_raw = tf.squeeze(image_raw)

    # Batch it up.
    images, images_raw, labels = tf.train.batch(
        [image, image_raw, label],
        batch_size=batch_size,
        num_threads=1,
        capacity=2 * batch_size)

    return images, images_raw, labels

def get_init_fn():
    """Returns a function run by the chief worker to warm-start the training."""
    checkpoint_exclude_scopes = ["InceptionV1/Logits", "InceptionV1/AuxLogits"]

    exclusions = [scope.strip() for scope in checkpoint_exclude_scopes]

    variables_to_restore = []
    for var in slim.get_model_variables():
        excluded = False
        for exclusion in exclusions:
            if var.op.name.startswith(exclusion):
                excluded = True
                break
        if not excluded:
            variables_to_restore.append(var)

    return slim.assign_from_checkpoint_fn(
        os.path.join(checkpoints_dir, 'inception_v1.ckpt'),
        variables_to_restore)

train_dir = './tmp/inception_finetuned/'

with tf.Graph().as_default():
    tf.logging.set_verbosity(tf.logging.INFO)

    dataset = flowers.get_split('train', flowers_data_dir)
    images, _, labels = load_batch(dataset, height=image_size, width=image_size)

    # Create the model, use the default arg scope to configure the batch norm parameters.
    with slim.arg_scope(inception.inception_v1_arg_scope()):
        logits, _ = inception.inception_v1(images, num_classes=dataset.num_classes, is_training=True)

    # Specify the loss function:
    one_hot_labels = slim.one_hot_encoding(labels, dataset.num_classes)
    slim.losses.softmax_cross_entropy(logits, one_hot_labels)
    total_loss = slim.losses.get_total_loss()

    # Create some summaries to visualize the training process:
    tf.summary.scalar('losses/Total_Loss', total_loss)

    # Specify the optimizer and create the train op:
    optimizer = tf.train.AdamOptimizer(learning_rate=0.01)
    train_op = slim.learning.create_train_op(total_loss, optimizer)

    train_writer = tf.summary.FileWriter(train_dir)
    # train_writer.add_summary()

    # Run the training:
    final_loss = slim.learning.train(
        train_op,
        logdir=train_dir,
        init_fn=get_init_fn(),
        number_of_steps=2,
        summary_writer=train_writer,
        trace_every_n_steps=2)

    print('Finished training. Last batch loss %f' % final_loss)
Related
There are a large number of questions related to TensorBoard issues, but in those cases TensorBoard doesn't run at all, shows nothing, or gives some kind of error. In my case, TensorBoard appears to run without errors and I can see the losses plot, but it doesn't appear to pick up more than one value.
Upvotes: 1
Views: 1406
Reputation: 2997
I switched the code around and went down very different rabbit holes, but I think the problem was that I overlooked the save_summaries_secs flag: if it is set high (e.g. the default of 600 seconds), it takes a long time for the metrics in TensorBoard to update.
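A minimal sketch of the fix, reusing the slim.learning.train call from the question (the 5-second interval is an arbitrary value chosen for illustration):

final_loss = slim.learning.train(
    train_op,
    logdir=train_dir,
    init_fn=get_init_fn(),
    number_of_steps=2,
    summary_writer=train_writer,
    # Write summaries every 5 seconds instead of the default 600,
    # so the losses plot in TensorBoard updates almost immediately.
    save_summaries_secs=5,
    trace_every_n_steps=2)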
Upvotes: 0