Reputation: 871
I have a biggish dataset stored relatively efficiently on disk, with one-hot vectors packed into the bits of a bunch of ints. The data format is fixed-width, so I can read it in fine with tf.data.FixedLengthRecordDataset, and with tf.decode_raw() and tf.bitwise.* I have converted my input data into a pile of 64-bit integers representing the input vectors. But I am stumped at expanding the integer bit patterns into a tensor.
Concretely (using bytes instead of longs for the sake of brevity), let's say I get the value 0xba (0b10111010). In that case I want to expand this to the vector (1, 0, 1, 1, 1, 0, 1, 0). What is the best way to achieve this?
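For reference, outside TensorFlow the desired expansion can be sketched in NumPy (np.unpackbits expands each uint8 into its 8 bits, most-significant bit first):

```python
import numpy as np

# Expand 0xba (0b10111010) into its bit vector, MSB first,
# which is exactly the target vector from the question.
bits = np.unpackbits(np.array([0xba], dtype=np.uint8))
print(bits)  # [1 0 1 1 1 0 1 0]
```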
Upvotes: 1
Views: 635
Reputation: 59731
You can do that like this:
import tensorflow as tf

def bits_to_one_hot(bits, depth, dtype=None):
    bits = tf.convert_to_tensor(bits)
    # One mask per bit position: 1, 2, 4, ..., 2 ** (depth - 1)
    masks = tf.bitwise.left_shift(tf.ones([], dtype=bits.dtype),
                                  tf.range(depth, dtype=bits.dtype))
    # AND each value against every mask to test the individual bits
    masked = tf.bitwise.bitwise_and(tf.expand_dims(bits, -1), masks)
    dtype = dtype or bits.dtype
    # Nonzero result means the bit was set
    return tf.cast(tf.not_equal(masked, 0), dtype)

data = [0b10111010, 0b00101101]
depth = 8
input_bits = tf.placeholder(tf.int64, [None])
one_hot = bits_to_one_hot(input_bits, depth)
with tf.Session() as sess:
    print(sess.run(one_hot, feed_dict={input_bits: data}))
Output:
[[0 1 0 1 1 1 0 1]
[1 0 1 1 0 1 0 0]]
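Note that because the masks test bit 0 first, the output is least-significant bit first; the question's example (1, 0, 1, 1, 1, 0, 1, 0) is most-significant bit first. Reversing the range of bit positions gives that order instead. A NumPy sketch of the same masking idea with the reversed order (the function name is mine, not from the answer):

```python
import numpy as np

def bits_to_one_hot_msb(bits, depth):
    # Masks from 2 ** (depth - 1) down to 1, so the most significant
    # bit comes first, matching the order asked for in the question.
    masks = np.left_shift(1, np.arange(depth - 1, -1, -1))
    return (np.bitwise_and(np.expand_dims(bits, -1), masks) != 0).astype(np.int64)

print(bits_to_one_hot_msb(np.array([0b10111010, 0b00101101]), 8))
# [[1 0 1 1 1 0 1 0]
#  [0 0 1 0 1 1 0 1]]
```

In the TensorFlow version, the equivalent change is tf.range(depth - 1, -1, -1, dtype=bits.dtype) in place of tf.range(depth, dtype=bits.dtype).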
Upvotes: 2