Pratik Kumar

Adding text labels to confusion matrix in Tensorflow for Tensorboard

I am customizing the code from TensorFlow's example to train on my own images, adding extra dense layers, dropout, momentum gradient descent, etc.

I wanted to add a confusion matrix to TensorBoard, so I followed the first answer (Jerod's) from this post (I also tried the second answer but ran into debugging issues) and added a few lines to the add_evaluation_step function. It now looks like this:

def add_evaluation_step(result_tensor, ground_truth_tensor):

  with tf.name_scope('accuracy'):
    with tf.name_scope('correct_prediction'):
      prediction = tf.argmax(result_tensor, 1)
      correct_prediction = tf.equal(
          prediction, tf.argmax(ground_truth_tensor, 1))
    with tf.name_scope('accuracy'):
      evaluation_step = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
  tf.summary.scalar('accuracy', evaluation_step)
  print('prediction shape :: {}'.format(ground_truth_tensor))

  # Add confusion matrix
  batch_confusion = tf.confusion_matrix(tf.argmax(ground_truth_tensor, 1), prediction,
                                        num_classes=7)
  # Create an accumulator variable to hold the counts
  confusion = tf.Variable(tf.zeros([7, 7], dtype=tf.int32), name='confusion')
  # Create the update op for doing a "+=" accumulation on the batch
  confusion_update = confusion.assign(confusion + batch_confusion)
  # Cast counts to float so tf.summary.image renormalizes to [0,255]
  confusion_image = tf.reshape(tf.cast(confusion_update, tf.float32),
                               [1, 7, 7, 1])
  # Publish the accumulated matrix as an image summary
  tf.summary.image('confusion', confusion_image)

  return evaluation_step, prediction

This gives me: [image: confusion matrix]

My question is: how can I add labels to the rows (actual class) and columns (predicted class), to get something like this:

[image: required confusion matrix]

Pratik Kumar

Following MLNINJA's answer helped me not only get the labels but also a beautiful live-streamed visualization. Here's how I did it. First, I wrote this function:

from textwrap import wrap
import itertools
import re

import matplotlib.figure
import numpy as np
import tensorflow as tf
import tfplot

def plot_confusion_matrix(correct_labels, predict_labels, labels, session, title='Confusion matrix', tensor_name='MyFigure/image', normalize=False):
  # Build the confusion-matrix op and evaluate it to a NumPy array
  conf = tf.contrib.metrics.confusion_matrix(correct_labels, predict_labels)
  cm = session.run(conf)

  if normalize:
    # Scale each row to integers in 0..10; rows without samples become 0
    cm = cm.astype('float')*10 / cm.sum(axis=1)[:, np.newaxis]
    cm = np.nan_to_num(cm, copy=True)
    cm = cm.astype('int')

  fig = matplotlib.figure.Figure(figsize=(7, 7), dpi=320, facecolor='w', edgecolor='k')
  ax = fig.add_subplot(1, 1, 1)
  im = ax.imshow(cm, cmap='Oranges')

  # Split CamelCase class names into words and wrap long names
  classes = [re.sub(r'([a-z](?=[A-Z])|[A-Z](?=[A-Z][a-z]))', r'\1 ', x) for x in labels]
  classes = ['\n'.join(wrap(l, 40)) for l in classes]

  tick_marks = np.arange(len(classes))

  # Place one tick per class so that every label lines up with its row/column
  ax.set_xlabel('Predicted', fontsize=7)
  ax.set_xticks(tick_marks)
  c = ax.set_xticklabels(classes, fontsize=10, rotation=-90, ha='center')

  ax.set_ylabel('True Label', fontsize=7)
  ax.set_yticks(tick_marks)
  ax.set_yticklabels(classes, fontsize=10, va='center')

  # Write each count into its cell ('.' marks an empty cell)
  for i, j in itertools.product(range(cm.shape[0]), range(cm.shape[1])):
    ax.text(j, i, format(cm[i, j], 'd') if cm[i, j] != 0 else '.', horizontalalignment="center", fontsize=6, verticalalignment='center', color="black")

  summary = tfplot.figure.to_summary(fig, tag=tensor_name)
  return summary

In my version of the main function, a summary writer conf__writer for the confusion matrix is first created at line 1227. The function is then called (at line 1287) inside the if clause (line 1261) that runs for every evaluation step, and finally the summary is written to the summary directory at line 1288.

Note: the add_evaluation_step function has also been modified to return the tensor for the ground truth inputs. At line 1278 this tensor is run to get the array of ground truth inputs, which is fed to the plot_confusion_matrix function.
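To see what the label clean-up inside plot_confusion_matrix does, here is a small standalone sketch using only the standard library. The helper name prettify_label and the sample class names are made up for illustration; the regular expression and the wrap call are the ones used in the function.

```python
import re
from textwrap import wrap

# Same pattern as in plot_confusion_matrix: matches the character just before
# a CamelCase boundary so that a space can be inserted after it.
CAMEL_BOUNDARY = re.compile(r'([a-z](?=[A-Z])|[A-Z](?=[A-Z][a-z]))')


def prettify_label(label, width=40):
    """Split a CamelCase class name into words, then wrap it to `width` columns."""
    spaced = CAMEL_BOUNDARY.sub(r'\1 ', label)
    return '\n'.join(wrap(spaced, width))


if __name__ == '__main__':
    # Hypothetical class names, only used to show the effect
    for name in ['goldenRetriever', 'HTTPServerError', 'cat']:
        print(repr(prettify_label(name, width=10)))
```

Wrapping matters mostly for long class names: a narrower width keeps the tick labels from running into the plot area.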

Jerod's answer contains almost everything you need, along with, for instance, yauheni_selivonchyk's other answer on how to add custom images to TensorBoard.

It is then only a matter of putting everything together, i.e.:

  1. Implementing methods to pass plotted images to summaries (as RGB arrays)
  2. Implementing a method to convert the matrix data into a prettified confusion image
  3. Defining your running evaluation operations to obtain the confusion matrix data (along with other metrics) and preparing a placeholder and summary to receive the plotted image
  4. Using everything together

1. Implementing methods to pass plotted images to summaries

import matplotlib
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
import numpy as np
import tensorflow as tf

# Inspired by yauheni_selivonchyk on SO

def get_figure(figsize=(10, 10), dpi=300):
    """
    Return a pyplot figure
    :param figsize:     Figure size in inches
    :param dpi:         Resolution in dots per inch
    :return:            PyPlot Figure
    """
    fig = plt.figure(num=0, figsize=figsize, dpi=dpi)
    return fig

def fig_to_rgb_array(fig, expand=True):
    """
    Convert figure into a RGB array
    :param fig:         PyPlot Figure
    :param expand:      Flag to expand (adds a leading batch dimension)
    :return:            RGB array
    """
    # Render the figure first so the canvas buffer holds the current drawing
    fig.canvas.draw()
    # Note: tostring_rgb is deprecated in newer Matplotlib releases (buffer_rgba replaces it)
    buf = fig.canvas.tostring_rgb()
    ncols, nrows = fig.canvas.get_width_height()
    shape = (nrows, ncols, 3) if not expand else (1, nrows, ncols, 3)
    return np.frombuffer(buf, dtype=np.uint8).reshape(shape)

def figure_to_summary(fig, summary, place_holder):
    """
    Convert figure into TF summary
    :param fig:             Figure
    :param summary:         Summary to eval
    :param place_holder:    Summary image placeholder
    :return:                Summary
    """
    image = fig_to_rgb_array(fig)
    # summary.eval needs a default session (e.g. inside `with session.as_default():`)
    return summary.eval(feed_dict={place_holder: image})
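The only subtle part of fig_to_rgb_array is the shape: the canvas reports (width, height), while an image array is indexed rows first, and tf.summary.image expects a leading batch dimension. The sketch below illustrates this with plain Python on a tiny hand-made buffer; the helper names and the sample bytes are assumptions for illustration, not part of Matplotlib or TensorFlow.

```python
def rgb_array_shape(width, height, expand=True):
    """Shape of the image array: rows first, then columns, then 3 colour channels."""
    shape = (height, width, 3)
    return (1,) + shape if expand else shape


def unflatten_rgb(buf, width, height):
    """Turn a flat row-major RGB byte string into nested rows of (r, g, b) tuples."""
    if len(buf) != width * height * 3:
        raise ValueError('buffer size does not match width * height * 3')
    pixels = [tuple(buf[i:i + 3]) for i in range(0, len(buf), 3)]
    return [pixels[r * width:(r + 1) * width] for r in range(height)]


if __name__ == '__main__':
    # A 3-pixel-wide, 2-pixel-high image: 3 * 2 * 3 = 18 bytes
    buf = bytes([255, 0, 0, 0, 255, 0, 0, 0, 255,
                 9, 9, 9, 1, 2, 3, 4, 5, 6])
    print(rgb_array_shape(3, 2))
    for row in unflatten_rgb(buf, 3, 2):
        print(row)
```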

2. Converting matrix data into a prettified confusion image

(Here is an example, but the styling is up to you.)

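def confusion_matrix_to_image_summary(confusion_matrix, summary, place_holder,
                                      list_classes, figsize=(9, 9)):
    """
    Plot confusion matrix and return as TF summary
    :param confusion_matrix:    Confusion matrix (N x N)
    :param summary:             Image summary to eval
    :param place_holder:        Summary image placeholder
    :param list_classes:        List of classes (N)
    :param figsize:             Pyplot figsize for the confusion image
    :return:                    Evaluated image summary
    """
    fig = get_figure(figsize=figsize)
    fig.clf()  # the figure is reused across calls, so clear the previous plot
    df = pd.DataFrame(confusion_matrix, index=list_classes, columns=list_classes)
    # The matrix holds raw counts, so cells are formatted as whole numbers
    # (a percent format such as '.0%' only suits a row-normalized matrix)
    ax = sns.heatmap(df, annot=True, fmt='.0f')
    # Whatever embellishments you want:
    plt.title('Confusion matrix')
    image_sum = figure_to_summary(fig, summary, place_holder)
    return image_sum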
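One detail worth checking in the heatmap call is the annotation format: seaborn applies fmt to each cell as a Python format spec, so a percent format is only meaningful once each row has been divided by its total. A minimal sketch in plain Python, with made-up counts and hypothetical helper names:

```python
def row_normalize(counts):
    """Divide each row by its sum; rows with no samples stay all zeros."""
    normalized = []
    for row in counts:
        total = sum(row)
        normalized.append([value / total if total else 0.0 for value in row])
    return normalized


def format_cells(matrix, fmt):
    """Apply a format spec to every cell, the way a heatmap annotation would."""
    return [[format(value, fmt) for value in row] for row in matrix]


if __name__ == '__main__':
    counts = [[8, 2], [1, 9]]  # rows: true class, columns: predicted class
    print(format_cells(counts, '.0%'))                 # raw counts misread as percentages
    print(format_cells(row_normalize(counts), '.0%'))  # per-class rates
    print(format_cells(counts, 'd'))                   # plain counts
```

Row-normalizing makes classes of different sizes comparable, at the cost of hiding how many samples each row is based on.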

3. Defining your evaluation operations & Preparing placeholder

# Inspired by Jerod's answer on SO
def add_evaluation_step(result_tensor, ground_truth_tensor, num_classes, confusion_matrix_figsize=(9, 9)):
    """
    Sets up the evaluation operations, computing the running accuracy and confusion image
    :param result_tensor:               Output tensor
    :param ground_truth_tensor:         Target class tensor
    :param num_classes:                 Number of classes
    :param confusion_matrix_figsize:    Pyplot figsize for the confusion image
    :return:                            TF operations, summaries and placeholders (see usage below)
    """
    scope = "evaluation"
    with tf.name_scope(scope):
        predictions = tf.argmax(result_tensor, 1, name="prediction")

        # Streaming accuracy (lookup and update tensors):
        accuracy, accuracy_update = tf.metrics.accuracy(ground_truth_tensor, predictions, name='accuracy')
        # Per-batch confusion matrix:
        batch_confusion = tf.confusion_matrix(ground_truth_tensor, predictions, num_classes=num_classes,
                                              name='batch_confusion')

        # Aggregated confusion matrix:
        confusion_matrix = tf.Variable(tf.zeros([num_classes, num_classes], dtype=tf.int32),
                                       name='confusion')
        confusion_update = confusion_matrix.assign(confusion_matrix + batch_confusion)

        # We suppose each batch contains a complete class, to directly normalize by its size:
        evaluate_streaming_metrics_op = tf.group(accuracy_update, confusion_update)

        # Confusion image from matrix (need to extend dims + cast to float so tf.summary.image renormalizes to [0,255]):
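        # (built from the variable rather than the update op, so that fetching the image does not accumulate again)
        confusion_image = tf.reshape(tf.cast(confusion_matrix, tf.float32), [1, num_classes, num_classes, 1])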

        # Summaries:
        tf.summary.scalar('accuracy', accuracy, collections=[scope])
        summary_op = tf.summary.merge_all(scope)

        # Preparing placeholder for confusion image (so that we can pass the plotted image to it):
        #      (we basically pre-allocate a plot figure and pass its RGB array to a placeholder)
        confusion_image_placeholder = tf.placeholder(
            tf.uint8, fig_to_rgb_array(get_figure(figsize=confusion_matrix_figsize)).shape)
        confusion_image_summary = tf.summary.image('confusion_image', confusion_image_placeholder)

    # Isolating all the variables stored by the metric operations:
    running_vars = tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES, scope=scope)
    running_vars += tf.get_collection(tf.GraphKeys.LOCAL_VARIABLES, scope=scope)

    # Initializer op to start/reset running variables
    reset_streaming_metrics_op = tf.variables_initializer(var_list=running_vars)

    return evaluate_streaming_metrics_op, reset_streaming_metrics_op, summary_op, confusion_image_summary, \
           confusion_image_placeholder, confusion_image
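Stripped of the graph machinery, the streaming part of this step is just "reset, add each batch's counts, read the totals". The following plain-Python sketch mirrors that life cycle; the class StreamingConfusion and the sample labels are illustrative assumptions and not a TensorFlow API.

```python
class StreamingConfusion:
    """Accumulates confusion counts across batches (rows: true class, columns: predicted)."""

    def __init__(self, num_classes):
        self.num_classes = num_classes
        self.reset()

    def reset(self):
        # Counterpart of running the variables initializer before an evaluation pass
        self.counts = [[0] * self.num_classes for _ in range(self.num_classes)]

    def update(self, true_labels, predicted_labels):
        # Counterpart of the "+=" assign op: add this batch's counts to the totals
        for t, p in zip(true_labels, predicted_labels):
            self.counts[t][p] += 1

    def accuracy(self):
        # Running accuracy is the diagonal mass divided by the total count
        total = sum(sum(row) for row in self.counts)
        correct = sum(self.counts[i][i] for i in range(self.num_classes))
        return correct / total if total else 0.0


if __name__ == '__main__':
    metrics = StreamingConfusion(num_classes=3)
    metrics.update([0, 1, 2], [0, 2, 2])  # first batch
    metrics.update([1, 1], [1, 0])        # second batch
    print(metrics.counts, metrics.accuracy())
    metrics.reset()                       # start a fresh evaluation pass
```

Forgetting the reset between evaluation passes is the usual pitfall: the counts then mix results from different checkpoints.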

4. Putting everything together

A quick example of how to use this, though it needs to be adapted to your training procedure, etc.

classes = ["obj1", "obj2", "obj3"]
num_classes = len(classes)
model = your_network(...)

# model.target stands for your ground-truth label tensor (attribute name assumed)
(evaluate_streaming_metrics_op, reset_streaming_metrics_op, summary_op,
 confusion_image_summary, confusion_image_placeholder, confusion_image) = add_evaluation_step(
    model.output, model.target, num_classes)

def evaluate(session, model, eval_data_gen):
    """
    Evaluate the model
    :param session:         TF session
    :param model:           Model to evaluate
    :param eval_data_gen:   Data to evaluate on
    :return:                Evaluation summaries for Tensorboard
    """
    # Resetting streaming vars:
    session.run(reset_streaming_metrics_op)

    # Evaluating running ops over complete eval dataset, e.g.:
    for batch in eval_data_gen:
        feed_dict = {model.inputs: batch}
        session.run(evaluate_streaming_metrics_op, feed_dict=feed_dict)

    # Obtaining the final results:
    summary_str, confusion_results = session.run([summary_op, confusion_image])

    # Converting confusion data into plot into summary:
    confusion_img_str = confusion_matrix_to_image_summary(
        confusion_results[0,:,:,0], confusion_image_summary, confusion_image_placeholder, classes)
    summary_str += confusion_img_str

    return summary_str # to be given to a SummaryWriter

