Dr. Andrew

Reputation: 2621

Tensorflow Extracting Classification Predictions

I have a TensorFlow NN model that classifies observations into one-hot-encoded group labels (the groups are mutually exclusive). It ends with the following, where layerActivs[-1] holds the activations of the final layer:

probs = sess.run(tf.nn.softmax(layerActivs[-1]),...)
classes = sess.run(tf.round(probs))
preds = sess.run(tf.argmax(classes, axis=1))  # argmax per observation (row)

The tf.round is included to force any low probabilities to 0. If every probability for an observation is at or below 50% (tf.round rounds half to even, so 0.5 becomes 0), no class is predicted. For example, with 4 classes we could have probs[0,:] = [0.2,0,0,0.4], so classes[0,:] = [0,0,0,0], and preds[0] = 0 follows.
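A minimal NumPy sketch of the round-then-argmax step on that row, with np.round standing in for tf.round (both round half to even). It is only an illustration of the arithmetic, not the model code:

```python
import numpy as np

# One observation whose probabilities are all below 0.5 (values from the text)
probs = np.array([[0.2, 0.0, 0.0, 0.4]])

classes = np.round(probs)            # every entry rounds down to 0
preds = np.argmax(classes, axis=1)   # argmax of an all-zero row is index 0

print(classes)  # [[0. 0. 0. 0.]]
print(preds)    # [0]
```

The all-zero row carries no information about which class was chosen, yet argmax still reports index 0.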

Obviously this is ambiguous, as it is the same result that would occur if we had probs[1,:] = [.9,0,.1,0] -> classes[1,:] = [1,0,0,0] -> preds[1] = 0. This is a problem when using the built-in tf.metrics functions, as they can't distinguish between no prediction and a prediction of class 0. The following code demonstrates this:

import numpy as np
import tensorflow as tf
import pandas as pd

''' prepare '''
classes = 6
n = 100

# simulate data
np.random.seed(42)
simY = np.random.randint(0,classes,n)     # pretend actual data
simYhat = np.random.randint(0,classes,n)  # pretend pred data
truth = np.sum(simY == simYhat)/n
tabulate = pd.Series(simY).value_counts()

# create placeholders
lab = tf.placeholder(shape=simY.shape, dtype=tf.int32)
prd = tf.placeholder(shape=simY.shape, dtype=tf.int32)
AM_lab = tf.placeholder(shape=simY.shape,dtype=tf.int32)
AM_prd = tf.placeholder(shape=simY.shape,dtype=tf.int32)

# create one-hot encoding objects
simYOH = tf.one_hot(lab,classes)

# create accuracy objects
acc = tf.metrics.accuracy(lab,prd)            # real accuracy with tf.metrics
accOHAM = tf.metrics.accuracy(AM_lab,AM_prd)  # OHE argmaxed to labels - expected to be correct

# now setup to pretend we ran a model & generated OHE predictions all unclassed
z = np.zeros(shape=(n,classes),dtype=float)
testPred = tf.constant(z)

''' run it all '''
# setup
sess = tf.Session()
sess.run([tf.global_variables_initializer(),tf.local_variables_initializer()])

# real accuracy with tf.metrics
ACC = sess.run(acc,feed_dict = {lab:simY,prd:simYhat})
# OHE argmaxed to labels - expected to be correct, but is it?
l,p = sess.run([simYOH,testPred],feed_dict={lab:simY})
p = np.argmax(p,axis=-1)
ACCOHAM = sess.run(accOHAM,feed_dict={AM_lab:simY,AM_prd:p})
sess.close()

''' print stuff '''
print('Accuracy')
print('-known truth: %0.4f'%truth)
print('-on unprocessed data: %0.4f'%ACC[1])
print('-on faked unclassed labels data (s.b. 0%%): %0.4f'%ACCOHAM[1])
print('----------\nTrue Class Freqs:\n%r'%(tabulate.sort_index()/n))

which has the output:

Accuracy
-known truth: 0.1500
-on unprocessed data: 0.1500
-on faked unclassed labels data (s.b. 0%): 0.1100
----------
True Class Freqs:
0    0.11
1    0.19
2    0.11
3    0.25
4    0.17
5    0.17
dtype: float64
Note that the frequency of class 0 (0.11) is the same as the faked accuracy.

I experimented with setting the entries of preds to np.nan for observations with no prediction, but tf.metrics.accuracy throws ValueError: cannot convert float NaN to integer. I also tried np.inf, but got OverflowError: cannot convert float infinity to integer.
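Both messages match what Python's built-in int() raises for NaN and infinity, which suggests the failure is in the float-to-integer conversion rather than in the metric itself. The sketch below reproduces the two errors without TensorFlow and then shows one possible workaround: an integer sentinel outside the label range. NO_PRED = -1 is a hypothetical choice, not something from the code above, and the accuracy here is computed with plain NumPy; whether a particular tf.metrics function accepts such a value is not verified here.

```python
import numpy as np

# NaN and infinity have no integer representation, so the conversion fails
errors = []
for bad in (float("nan"), float("inf")):
    try:
        int(bad)
    except (ValueError, OverflowError) as exc:
        errors.append(type(exc).__name__)
print(errors)  # ['ValueError', 'OverflowError']

# Hypothetical sentinel: an integer that can never equal a real label (0..p-1)
NO_PRED = -1
labels = np.array([0, 2, 1, 0])
preds = np.array([0, NO_PRED, 1, NO_PRED])

# Equality-based accuracy counts every unclassed observation as wrong
accuracy = np.mean(labels == preds)
print(accuracy)  # 0.5
```

With the sentinel, the last observation (true class 0, no prediction) is scored as a miss instead of a hit.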

How can I convert the rounded probabilities to class predictions, but appropriately handle unpredicted observations?

Upvotes: 0

Views: 1876

Answers (1)

Dr. Andrew

Reputation: 2621

This has gone long enough without an answer, so I'll post my own solution as the answer. I convert class-membership probabilities to class predictions with a new function that has 3 main steps:

  1. set any NaN probabilities to 0
  2. set any probabilities at or below 1/num_classes (chance level) to 0
  3. use np.argmax() to extract predicted classes, then assign each unclassed observation to a class drawn uniformly at random

The resulting vector of integer class labels can be passed to the tf.metrics functions. My function is below:

import numpy as np

def predFromProb(classProbs):
  '''
  Take in as input an (m x p) matrix of m observations' class probabilities in
  p classes and return an m-length vector of integer class labels (0...p-1). 
  Probabilities at or below 1/p are set to 0, as are NaNs; any unclassed
  observations are randomly assigned to a class.
  '''
  numClasses = classProbs.shape[1]
  # zero out class probs that are at or below chance, or NaN
  probs = classProbs.copy()
  probs[np.isnan(probs)] = 0
  probs = probs*(probs > 1.0/numClasses)  # 1.0 avoids integer division on Python 2
  # find any un-classed observations
  unpred = ~np.any(probs,axis=1)
  # get the predicted classes
  preds = np.argmax(probs,axis=1)
  # randomly classify un-classed observations
  rnds = np.random.randint(0,numClasses,np.sum(unpred))
  preds[unpred] = rnds

  return preds
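A short usage sketch follows. The function is repeated in condensed form only so the snippet runs on its own; the demo matrix and the seed are made-up illustration values.

```python
import numpy as np

def predFromProb(classProbs):
    # condensed copy of the function above
    numClasses = classProbs.shape[1]
    probs = classProbs.copy()
    probs[np.isnan(probs)] = 0
    probs = probs * (probs > 1.0 / numClasses)
    unpred = ~np.any(probs, axis=1)
    preds = np.argmax(probs, axis=1)
    preds[unpred] = np.random.randint(0, numClasses, np.sum(unpred))
    return preds

np.random.seed(0)  # only to make the random assignment repeatable
demo = np.array([
    [0.90, 0.00, 0.10, 0.00],    # clear winner: class 0
    [0.10, 0.20, 0.30, 0.40],    # classes 2 and 3 exceed 1/4; argmax picks 3
    [0.25, 0.25, 0.25, 0.25],    # all at chance: unclassed, assigned at random
    [np.nan, 0.30, 0.60, 0.10],  # NaN is zeroed; class 2 wins
])
preds = predFromProb(demo)
print(preds[[0, 1, 3]])  # [0 3 2]
print(0 <= preds[2] < 4)  # True: some valid class, chosen at random
```

Because unclassed observations receive a random class, they are expected to be scored correct at roughly the chance rate of 1/p rather than at the frequency of class 0.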

Upvotes: 1
