Reputation: 45
I'm trying to create a 2D histogram in TensorFlow for use in a custom loss function. More generally, I think people could benefit from using co-activations of neurons, which requires a similar structure.
Here's specifically what I'm trying to do:
Given a Nx2 Tensor, where N is some number of samples, I'd like to create a (binned) histogram of co-activations. For example, in the simple case of input=[[0, 0.01], [0, 0.99], [0.5, 0.5]] and with 10000 bins total, I'd like to generate a 100x100 Tensor with all 0s except for 3 entries at (0, 0.01), (0, 0.99), and (0.5, 0.5), where the value would be 1/3 (scaling is easy, so I'd be fine with 1 instead).
I could do this easily using standard NumPy operations:
import numpy as np

neuron1 = data[:, 0]
neuron2 = data[:, 1]
hist_2d = np.zeros((100, 100))
for neuron1_output, neuron2_output in zip(neuron1, neuron2):
    hist_2d[int(100 * neuron1_output), int(100 * neuron2_output)] += 1
If I want to use hist_2d as part of a loss function in TensorFlow, though, it looks like I can't do this kind of iteration.
Does anyone know of a good way to generate the 2D histogram I'm looking for? I was happy to find tf.histogram_fixed_width(), but that only generates 1D histograms. I've started looking into tf.while_loop() and tf.map_fn(), but I'm pretty new to TensorFlow, so I'm not sure which avenue is most promising.
Upvotes: 1
Views: 1464
Reputation: 21
Maybe this snippet will help you.
import tensorflow as tf

@tf.function
def get2dHistogram(x, y,
                   value_range,
                   nbins=100,
                   dtype=tf.dtypes.int32):
    """
    Bins the (x, y) coordinates of points into a simple square 2D histogram.

    Given the tensors x and y:
      x: x coordinates of points
      y: y coordinates of points
    this operation returns a rank-2 `Tensor` holding the counts of the 2D
    histogram. The bins are equal width and determined by the arguments
    `value_range` and `nbins`.

    Args:
      x: Numeric `Tensor`.
      y: Numeric `Tensor`.
      value_range: value_range[0] gives the limits for x,
                   value_range[1] gives the limits for y.
      nbins: Scalar `int32 Tensor`. Number of histogram bins per axis.
      dtype: dtype for the bin indices.
    """
    x_range = value_range[0]
    y_range = value_range[1]
    # Assign each point to a y-bin, then histogram the x values within each y-bin.
    histy_bins = tf.histogram_fixed_width_bins(y, y_range, nbins=nbins, dtype=dtype)
    H = tf.map_fn(lambda i: tf.histogram_fixed_width(tf.boolean_mask(x, tf.equal(histy_bins, i)),
                                                     x_range, nbins=nbins),
                  tf.range(nbins))
    return H  # Matrix! Rows correspond to y-bins, columns to x-bins.
Written in TensorFlow 2.0, but you can surely manage it.
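For what it's worth, a minimal usage sketch (my own, not part of the snippet above), assuming TF 2.x eager execution and activations in [0, 1):

import numpy as np

# Hypothetical sample data: 1000 points with coordinates in [0, 1).
pts = tf.constant(np.random.rand(1000, 2), dtype=tf.float32)
hist2d = get2dHistogram(pts[:, 0], pts[:, 1],
                        value_range=[[0.0, 1.0], [0.0, 1.0]],
                        nbins=100)
# Every point lands in exactly one bin, so the counts should sum to 1000.
print(tf.reduce_sum(hist2d).numpy())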
Upvotes: 1
Reputation: 45
Posting an "answer", more like a workaround, that I figured out.
The whole reason I wanted to create the 2D histogram is that I wanted to calculate the entropy of the joint distribution of the activations of two neurons. I'm already discretizing the values of the activations into bins, so it's ok if I shuffle the distribution around as that won't modify the entropy value.
Given that, here's what I did: I created a 1D histogram with a squared number of bins and simply slid the values so that the first half of the digits corresponded to activations for neuron1 and the second half for activations for neuron2. In python:
# Calculate the entropy of a 1D tensor, fuzzing the edges with epsilon to keep the
# numbers clean. (Note: tf.log is the TF 1.x name; in TF 2.x it's tf.math.log.)
def calculate_entropy(y, epsilon):
    clipped = tf.clip_by_value(y, epsilon, 1 - epsilon)
    return -tf.cast(tf.reduce_sum(clipped * tf.log(clipped)), dtype=tf.float32)
# Sandbox for developing the entropy calculations for y
def tf_entropies(y, epsilon, nbins):
    # Create histograms for the activations in the batch.
    value_range = [0.0, 1.0]
    # For the prototype, only consider the first two features.
    neuron1 = y[:, 0]
    neuron2 = y[:, 1]
    hist1 = tf.histogram_fixed_width(neuron1, value_range, nbins=nbins)
    hist2 = tf.histogram_fixed_width(neuron2, value_range, nbins=nbins)
    # Normalize by the total number of samples so the bins sum to 1.
    count = tf.reduce_sum(hist1)
    dist1 = tf.divide(hist1, count)
    dist2 = tf.divide(hist2, count)
    neuron1_entropy = calculate_entropy(dist1, epsilon)
    neuron2_entropy = calculate_entropy(dist2, epsilon)
    # Calculate the joint distribution and then get the entropy.
    # Floor neuron1 to its bin resolution, then pack neuron2 into the low-order
    # digits: bin(meshed) = nbins * bin(neuron1) + bin(neuron2).
    recast_n1 = tf.cast(tf.divide(tf.cast(nbins * neuron1, tf.int32), nbins), tf.float32)
    meshed = recast_n1 + tf.divide(neuron2, nbins)  # Shift the neuron2 values into the gaps
    joint_hist = tf.histogram_fixed_width(meshed, value_range, nbins=nbins * nbins)
    joint_dist = tf.divide(joint_hist, count)
    joint_entropy = calculate_entropy(joint_dist, epsilon)
    return neuron1_entropy, neuron2_entropy, joint_entropy, joint_dist
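A minimal smoke test (mine, not the author's; it assumes TF 1.x graph mode, since the snippet uses tf.log, and the batch here is made up):

import numpy as np

# Hypothetical batch of activations for two neurons, values in [0, 1).
y_batch = tf.constant(np.random.rand(256, 2), dtype=tf.float32)
h1, h2, h_joint, _ = tf_entropies(y_batch, epsilon=1e-6, nbins=100)
with tf.Session() as sess:
    print(sess.run([h1, h2, h_joint]))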
Once I have the joint histogram, I can get the joint entropy using the same procedure as before. I validated the result by implementing the same logic with ordinary NumPy operations; the entropy calculations match.
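For reference, a cross-check along those lines might look like the following (my own sketch using np.histogram2d, not the author's exact validation code; it mirrors the clipping in calculate_entropy above):

import numpy as np

def np_joint_entropy(y, nbins=100, epsilon=1e-6):
    # 2D histogram over [0, 1] x [0, 1], normalized by the number of samples.
    hist, _, _ = np.histogram2d(y[:, 0], y[:, 1], bins=nbins,
                                range=[[0.0, 1.0], [0.0, 1.0]])
    dist = np.clip(hist / len(y), epsilon, 1 - epsilon)
    return -np.sum(dist * np.log(dist))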
I hope this helps others if they run into similar problems.
Upvotes: 2