Reputation: 19
How can I implement N-hot encoding according to the positions of the 1 bits in a tf.int64? The input is a tensor containing several tf.int64 values. The N-hot encoding is intended to replace the one-hot encoding in tf.slim.
The one-hot encoding is currently implemented as follows:
import numpy

def dense_to_one_hot(labels_dense, num_classes):
    """Convert class labels from scalars to one-hot vectors."""
    num_labels = labels_dense.shape[0]
    # Offset of each row in the flattened (num_labels, num_classes) array.
    index_offset = numpy.arange(num_labels) * num_classes
    labels_one_hot = numpy.zeros((num_labels, num_classes))
    # Set a single 1 per row via flat (linear) indexing.
    labels_one_hot.flat[index_offset + labels_dense.ravel()] = 1
    return labels_one_hot
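For example, a quick check of this helper on a small, made-up label array:

labels = numpy.array([0, 2, 1])
print(dense_to_one_hot(labels, 3))
# [[1. 0. 0.]
#  [0. 0. 1.]
#  [0. 1. 0.]]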
The N-hot encoding means: 19 = 00010011 in binary, so the result after encoding is [0,0,0,1,0,0,1,1].
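For concreteness, the desired behavior in plain NumPy would be the following (a minimal sketch assuming a fixed width n; the name n_hot_reference is just a placeholder):

import numpy as np

def n_hot_reference(values, n):
    """Unpack each integer into its n-bit big-endian representation."""
    values = np.asarray(values)
    masks = 1 << np.arange(n)[::-1]  # [2^(n-1), ..., 2, 1]
    return ((values[..., np.newaxis] & masks) != 0).astype(np.int64)

print(n_hot_reference(19, 8))  # [0 0 0 1 0 0 1 1]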
Upvotes: 1
Views: 386
Reputation: 15119
Find below an alternative to @jdehesa's great answer. This version computes the bit length N itself (but it works only on single-valued tensors, or on tensors containing values of the same bit length):
import tensorflow as tf

def logn(x, n):
    # Logarithm of x in base n.
    numerator = tf.log(x)
    denominator = tf.log(tf.cast(n, dtype=numerator.dtype))
    return numerator / denominator

def count_bits(x):
    # Bit length of x, i.e. floor(log2(x)) + 1 (assumes x > 0).
    return tf.cast((logn(tf.cast(x, dtype=tf.float32), 2)) + 1, dtype=x.dtype)

def n_hot_encode(x):
    """
    Unpack an integer into its variable-length bit representation.
    :param x: Int tensor of shape ()
    :return: Bool tensor of shape (N,) with N = bit length of x
    """
    N = count_bits(x)
    bins = tf.bitwise.left_shift(1, tf.range(N))[::-1]  # [2^(N-1), ..., 2, 1]
    x_unpacked = tf.reshape(tf.bitwise.bitwise_and(x, bins), [-1])
    x_bits = tf.cast(x_unpacked, dtype=tf.bool)
    return x_bits
with tf.Session() as sess:
    result = sess.run(n_hot_encode(tf.constant(19)))
    print(result)
    # > [ True False False True True]
    result = sess.run(n_hot_encode(tf.constant(255)))
    print(result)
    # > [ True True True True True True True True]
Using tf.one_hot():
labels_one_hot = tf.one_hot(labels_dense, num_classes)
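For instance, a quick sketch with placeholder values standing in for labels_dense and num_classes:

labels_dense = tf.constant([0, 2, 1])
labels_one_hot = tf.one_hot(labels_dense, 3)
with tf.Session() as sess:
    print(sess.run(labels_one_hot))
# [[1. 0. 0.]
#  [0. 0. 1.]
#  [0. 1. 0.]]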
Upvotes: 2
Reputation: 59701
This is one solution:
import numpy as np
import tensorflow as tf

def n_hot_encoding(a, n):
    a = tf.convert_to_tensor(a)
    m = 1 << np.arange(n)[::-1]  # Bit masks [2^(n-1), ..., 2, 1]
    # Reshape the masks so they broadcast against the trailing axis of a.
    shape = np.r_[np.ones(len(a.shape), dtype=int), -1]
    m = m.reshape(shape)
    hits = tf.bitwise.bitwise_and(a[..., tf.newaxis], tf.cast(m, a.dtype))
    return tf.not_equal(hits, 0)
with tf.Graph().as_default(), tf.Session() as sess:
    n_hot = n_hot_encoding([19, 20, 21], 10)
    print(sess.run(tf.cast(n_hot, tf.int32)))
Output:
[[0 0 0 0 0 1 0 0 1 1]
 [0 0 0 0 0 1 0 1 0 0]
 [0 0 0 0 0 1 0 1 0 1]]
It assumes that N is a regular scalar (not a TensorFlow value) and that the number of dimensions of the array to convert is known (the size of each dimension can be dynamic, but a.shape should not be just None). The function can be adapted to TensorFlow-only computation like this:
import tensorflow as tf

def n_hot_encoding(a, n):
    a = tf.convert_to_tensor(a)
    n = tf.convert_to_tensor(n)
    m = tf.bitwise.left_shift(1, tf.range(n)[::-1])  # Bit masks as a tensor
    # Build a broadcastable shape [1, ..., 1, -1] from the (dynamic) rank of a.
    shape = tf.concat([tf.ones([tf.rank(a)], dtype=tf.int64), [-1]], axis=0)
    m = tf.reshape(m, shape)
    hits = tf.bitwise.bitwise_and(a[..., tf.newaxis], tf.cast(m, a.dtype))
    return tf.not_equal(hits, 0)
This should work with any input but may do a bit of extra work on every graph run.
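For example, a quick check of this TensorFlow-only version with the same inputs as above, now passing n as a tensor too:

with tf.Graph().as_default(), tf.Session() as sess:
    n_hot = n_hot_encoding(tf.constant([19, 20, 21]), tf.constant(10))
    print(sess.run(tf.cast(n_hot, tf.int32)))
# Same output as above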
Upvotes: 2