MariusSuvatne

Reputation: 41

Tensorflow: Output where max is 1 and rest is 0

I am working on a special case of a convolutional network for solving puzzles, which will partly be reused later to classify images. Currently I am setting up the final layer of the puzzle part.

Each puzzle consists of 9 pieces, and y_hat is an 81-long (9x9) NumPy array containing the original position of each piece.

For the network I use a TensorFlow functional model, where the final layer consists of 9 small sub-models, each applying a softmax to indicate where a piece should go. Is there any way to make the last layer of the sub-models output 1 only for the highest value after the softmax and 0 for the rest? I have been searching for days now. This still has to be part of the neural network, so it can be used during training.

What I mean is:

[0.01,0.41,0.02,0.32,0.01,0.43] => [0,0,0,0,0,1]

Upvotes: 1

Views: 808

Answers (2)

Wt.N

Reputation: 1658

If you just want to classify things, then use from_logits=True in tf.keras.losses.CategoricalCrossentropy[1] or tf.keras.losses.SparseCategoricalCrossentropy[2], and end each sub-model with a plain tf.keras.layers.Dense layer (no softmax activation). The softmax calculation is then done inside these loss functions.

Computing the softmax inside the loss instead of as a separate layer is also more efficient and numerically stable. You don't need to make your own one_hot layer for the output of your network; just set from_logits=True in the loss.
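For example, a minimal sketch of one classification head (the input shape and hidden size here are just placeholders) could look like this:

import tensorflow as tf

# One sub-model head: a plain Dense layer emitting 9 raw logits.
# The softmax is computed inside the loss via from_logits=True.
inputs = tf.keras.Input(shape=(64,))            # placeholder input shape
hidden = tf.keras.layers.Dense(32, activation='relu')(inputs)
logits = tf.keras.layers.Dense(9)(hidden)       # no activation here
model = tf.keras.Model(inputs, logits)

model.compile(
    optimizer='adam',
    loss=tf.keras.losses.CategoricalCrossentropy(from_logits=True),
)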

If you still want your own one_hot-style converter layer, here is an idea that keeps the graph differentiable: use a very large scaling parameter, known as the thermodynamic beta, inside the softmax.

import tensorflow as tf

class OneHot(tf.keras.layers.Layer):
    def __init__(self, infi=1e9):
        super(OneHot, self).__init__()
        self.infi = infi  # very large "inverse temperature" (thermodynamic beta)

    def call(self, x):
        # Scaling by a huge factor makes the softmax collapse onto the argmax,
        # while keeping the operation differentiable.
        return tf.nn.softmax(self.infi * x) # x has shape [B, 9]

    def get_config(self):
        return {'infi': self.infi}

See how it works below.

>>> a = tf.constant([[1.,2.], [3., 4.]], dtype=tf.float32)
>>> tf.nn.softmax(a)
<tf.Tensor: shape=(2, 2), dtype=float32, numpy=
array([[0.26894143, 0.7310586 ],
       [0.26894143, 0.7310586 ]], dtype=float32)>
>>> tf.nn.softmax(1e9 * a)
<tf.Tensor: shape=(2, 2), dtype=float32, numpy=
array([[0., 1.],
       [0., 1.]], dtype=float32)>
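If you plug the OneHot layer defined above on top of a Dense head in the functional API (again with placeholder shapes), it could look like this:

import tensorflow as tf

inputs = tf.keras.Input(shape=(64,))            # placeholder input shape
scores = tf.keras.layers.Dense(9, activation='softmax')(inputs)
one_hot = OneHot()(scores)                      # (almost) 0/1 output, still differentiable
model = tf.keras.Model(inputs, one_hot)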

I am not sure, though, whether this output layer actually makes learning any better.

== My first answer is below. Not recommended.

You can use tf.one_hot[3] to do this.

import tensorflow as tf

x = tf.constant([0.01,0.41,0.02,0.32,0.01,0.43], dtype=tf.float32)
i = tf.argmax(x)
y = tf.one_hot(i, 6)
# <tf.Tensor: shape=(6,), dtype=float32, numpy=array([0., 0., 0., 0., 0., 1.], dtype=float32)>

If you want to make it a Keras layer, write a custom layer[4].

import tensorflow as tf

class OneHot(tf.keras.layers.Layer):
    def __init__(self):
        super(OneHot, self).__init__()

    def call(self, x):
        i = tf.argmax(x, axis=1) # x has shape [B, 9]
        # Note: argmax/one_hot have no gradient, so this layer blocks backprop.
        return tf.one_hot(i, 9)
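For example, for a batch containing one 9-way score vector:

>>> layer = OneHot()
>>> layer(tf.constant([[0.01, 0.41, 0.02, 0.32, 0.01, 0.43, 0.0, 0.0, 0.1]]))
<tf.Tensor: shape=(1, 9), dtype=float32, numpy=array([[0., 0., 0., 0., 0., 1., 0., 0., 0.]], dtype=float32)>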

[1] https://www.tensorflow.org/api_docs/python/tf/keras/losses/CategoricalCrossentropy
[2] https://www.tensorflow.org/api_docs/python/tf/keras/losses/SparseCategoricalCrossentropy
[3] https://www.tensorflow.org/api_docs/python/tf/one_hot
[4] https://www.tensorflow.org/guide/keras/custom_layers_and_models

Upvotes: 2

Canasta

Reputation: 228

index = tf.argmax(one_hot_vector, axis=0)

Searching for "one hot decode" may help you.
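For example, decoding the one-hot vector from the question back to an index:

>>> import tensorflow as tf
>>> one_hot_vector = tf.constant([0., 0., 0., 0., 0., 1.])
>>> tf.argmax(one_hot_vector, axis=0)
<tf.Tensor: shape=(), dtype=int64, numpy=5>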

Upvotes: -1
