hockeybro

Reputation: 1001

How to Have Multiple Softmax Outputs in Tensorflow?

I am trying to create a network in TensorFlow with multiple softmax outputs, each of a different size. The network architecture is: Input -> LSTM -> Dropout. Then I have two softmax layers: a softmax of 10 outputs and a softmax of 20 outputs. The reason for this is that I want to generate two sets of outputs (10 and 20) and then combine them to produce a final output. I'm not sure how to do this in TensorFlow.

Previously, to make a network like the one described, but with a single softmax, I think I could do something like this.

# Token ids for the embedding lookup below, plus the true length of each sequence
inputs = tf.placeholder(tf.int32, [batch_size, maxlength])
lengths = tf.placeholder(tf.int32, [batch_size])
embeddings = tf.Variable(tf.random_uniform([vocabsize, 256], -1, 1))
lstm = {}
lstm[0] = tf.contrib.rnn.LSTMCell(hidden_layer_size, state_is_tuple=True, initializer=tf.contrib.layers.xavier_initializer(seed=random_seed))
lstm[0] = tf.contrib.rnn.DropoutWrapper(lstm[0], output_keep_prob=0.5)
lstm[0] = tf.contrib.rnn.MultiRNNCell(cells=[lstm[0]] * 1, state_is_tuple=True)
output_layer = {}
output_layer[0] = Layer.W(1 * hidden_layer_size, 20, 'OutputLayer')
output_bias = {}
output_bias[0] = Layer.b(20, 'OutputBias')
outputs = {}
fstate = {}
with tf.variable_scope("lstm0"):
    # create the rnn graph at run time
  outputs[0], fstate[0] = tf.nn.dynamic_rnn(lstm[0], tf.nn.embedding_lookup(embeddings, inputs),
                                      sequence_length=lengths, 
                                      dtype=tf.float32)
logits = {}
logits[0] = tf.matmul(tf.concat([f.h for f in fstate[0]], 1), output_layer[0]) + output_bias[0]
loss = {}
loss[0] = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits[0], labels=labels[0]))

However, now I want my RNN output (after the dropout) to flow into two softmax layers, one of size 10 and another of size 20. Does anyone have an idea of how to do this?

Thanks

Edit: Ideally I would like to use a version of softmax such as the one defined in this Knet Julia library. Does TensorFlow have an equivalent? https://github.com/denizyuret/Knet.jl/blob/1ef934cc58f9671f2d85063f88a3d6959a49d088/deprecated/src7/op/actf.jl#L103

Upvotes: 7

Views: 4452

Answers (2)

Pop

Reputation: 12411

You can do the following with the output of dynamic_rnn (the concatenated final state that you feed into logits[0] in your question) in order to compute the two softmaxes and their corresponding losses:

with tf.variable_scope("softmax_0"):
    # Transform you RNN output to the right output size = 10
    W = tf.get_variable("kernel_0", [output[0].get_shape()[1], 10])
    logits_0 = tf.matmul(inputs, W)
    # Apply the softmax function to the logits (of size 10)
    output_0 = tf.nn.softmax(logits_0, name = "softmax_0")
    # Compute the loss (as you did in your question) with softmax_cross_entropy_with_logits directly applied on logits
    loss_0 = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits_0, labels=labels[0]))

with tf.variable_scope("softmax_1"):  
    # Transform you RNN output to the right output size = 20
    W = tf.get_variable("kernel_1", [output[0].get_shape()[1], 20])
    logits_1 = tf.matmul(inputs, W)
    # Apply the softmax function to the logits (of size 20)
    output_1 = tf.nn.softmax(logits_1, name = "softmax_1")
    # Compute the loss (as you did in your question) with softmax_cross_entropy_with_logits directly applied on logits
    loss_1 = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits_1, labels=labels[1]))

You can then combine the two losses if it is relevant to your application:

total_loss = loss_0 + loss_1
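
A single train op can then minimize both heads at once, since total_loss depends on the weights of both softmax layers. As a minimal sketch (the optimizer choice and learning rate below are illustrative assumptions, not part of the original code):

# Minimal sketch: assumed Adam optimizer and learning rate (illustrative values only).
# One train op updates the LSTM and both softmax heads, because total_loss
# depends on the variables of all of them.
train_op = tf.train.AdamOptimizer(learning_rate=1e-3).minimize(total_loss)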

EDIT To answer your question in the comments about what you specifically need to do with the two softmax outputs: you can do approximately the following:

with tf.variable_scope("second_part"):
    W1 = tf.get_variable("W_1", [output_1.get_shape()[1], n])
    W2 = tf.get_variable("W_2", [output_2.get_shape()[1], n])
    prediction = tf.matmul(output_1, W1) + tf.matmul(output_2, W2)
with tf.variable_scope("optimization_part"):
    loss = tf.reduce_mean(tf.squared_difference(prediction, label))

You just need to define n, the number of columns of W1 and W2.
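
For instance (an illustrative assumption, not something specified in the question), if the combined prediction is a single scalar per example, n and the label placeholder used above could be defined like this:

# Illustrative only: assumes the final combined prediction is one scalar per example.
n = 1
label = tf.placeholder(tf.float32, [batch_size, n])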

Upvotes: 4

Neeraj Kashyap

Reputation: 130

You aren't defining the logits for the size-10 softmax layer in your code, and you would have to do that explicitly.

Once that is done, you can use tf.nn.softmax, applying it separately to each of your logit tensors.

For example, for your 20-class softmax tensor:

softmax20 = tf.nn.softmax(logits[0])

For the other layer, you could do:

output_layer[1] = Layer.W(1 * hidden_layer_size, 10, 'OutputLayer10')
output_bias[1] = Layer.b(10, 'OutputBias10')

logits[1] = tf.matmul(tf.concat([f.h for f in fstate[0]], 1),
                      output_layer[1]) + output_bias[1]

softmax10 = tf.nn.softmax(logits[1])

There is also a tf.contrib.layers.softmax, which allows you to apply the softmax over the final axis of a tensor with more than 2 dimensions, but it doesn't look like you need anything like that. tf.nn.softmax should work here.
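
For completeness, here is a hedged illustration of that multi-dimensional case (the per-timestep dense projection is an assumed example, not something your network needs):

# Not needed for this setup; just illustrating a softmax over the last axis of a 3-D tensor.
# outputs[0] from dynamic_rnn has shape [batch_size, maxlength, hidden_layer_size].
per_step_logits = tf.layers.dense(outputs[0], 20)             # assumed per-timestep projection
per_step_probs = tf.contrib.layers.softmax(per_step_logits)   # softmax applied over the final axis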

Side note: output_layer is not the greatest name for that dict - it should be something involving weights. These weights and biases (output_layer, output_bias) also do not represent the output layer of your network, as that will come from whatever you do to your softmax outputs, right? [Sorry, couldn't help myself.]

Upvotes: 5
