Reputation: 97
I am completely new to this area, so my question may sound stupid. I have the following model defined in Keras, which takes multiple inputs and predicts one of 2 outcomes:
inputs = []
outputs = []
for feature in features:
    length = feature.length
    input = tf.keras.Input(batch_size=batch_size, shape=(length,), sparse=False,
                           name=feature.name)
    output = tf.keras.layers.Dense(units=2)(raw_input)
    inputs.append(input)
    outputs.append(output)
logits = tf.keras.layers.add(outputs)
##########################################################
# Opt1: probabilities = tf.keras.activations.softmax(logits)
# Opt2: probabilities = tf.keras.layers.Softmax()(logits)
# Opt3: probabilities = tf.keras.layers.Softmax(name="label")(logits)
##########################################################
model = tf.keras.Model(inputs=inputs, outputs=probabilities)
model.compile(loss=tf.keras.losses.BinaryCrossentropy(),
              metrics=["accuracy"])
I want the model's output to show the probability of each of the 2 predicted outcomes, so I attempt to apply softmax to the logits as the output.
I have tried the 3 options shown above in the code (Opt1, Opt2, Opt3).
Opt1 gives the following error:
ValueError: No data provided for "tf_op_layer_Softmax". Need data for each key in: ['tf_op_layer_Softmax']
Opt2 gives a similar error:
ValueError: No data provided for "softmax". Need data for each key in: ['softmax']
However, Opt3 runs just fine, despite being the same as Opt2 except for a different name.
My questions are mainly the following: 1. In a Keras model, how do we usually apply a softmax directly onto logits without creating another layer? 2. What is the difference between Opt2 and Opt3, since it's just a name change?
Thanks for the help
Upvotes: 0
Views: 1157
Reputation: 11198
Most layers have an activation parameter which you can use to apply softmax. For example, in a Dense layer you can say something like:
dense = tf.keras.layers.Dense(10, activation='softmax')(in_layer)
There is no such parameter for add(), so you can use a separate activation layer instead:
softmax_out = tf.keras.layers.Activation('softmax')(in_layer)
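Putting that together with the add() from the question, a minimal sketch (with made-up input names and shapes) of applying softmax via an Activation layer after add() looks like:

```python
import numpy as np
import tensorflow as tf

# Two toy inputs (made-up shapes), each projected to 2-class logits.
a = tf.keras.Input(shape=(4,), name='a')
b = tf.keras.Input(shape=(4,), name='b')
logits = tf.keras.layers.add([tf.keras.layers.Dense(2)(a),
                              tf.keras.layers.Dense(2)(b)])
# add() has no activation parameter, so apply softmax with an Activation layer.
probs = tf.keras.layers.Activation('softmax')(logits)
model = tf.keras.Model(inputs=[a, b], outputs=probs)

out = model.predict([np.random.rand(3, 4).astype('float32'),
                     np.random.rand(3, 4).astype('float32')], verbose=0)
print(out.shape)  # (3, 2); each row sums to 1
```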
output = tf.keras.layers.Dense(units=2)(raw_input)
Where is this raw_input coming from? As for Opt2 vs Opt3, there's no practical difference between them; it's most probably just the way you've designed the network.
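For intuition about what the softmax output represents, here is a small NumPy sketch (with made-up logit values) of the computation the Softmax layer performs:

```python
import numpy as np

def softmax(logits, axis=-1):
    # Shift by the max for numerical stability, then normalize the exponentials.
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

# A batch of 3 examples with 2-class logits (made-up values).
logits = np.array([[2.0, 0.0],
                   [0.0, 0.0],
                   [-1.0, 3.0]])
probs = softmax(logits)
print(probs.sum(axis=1))  # [1. 1. 1.] -- each row is a probability distribution
```

Equal logits (the middle row) give equal probabilities of 0.5 each, which is why feeding logits through softmax answers the question's goal of showing the probability of each outcome.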
Here's the fixed version, which works with Opt2.
import tensorflow as tf
inputs = []
outputs = []

class Feature:
    def __init__(self, len_=10, name_='unk'):
        self.length = len_
        self.name = name_

features = []
for i in range(5):
    f = Feature(name_='unk' + str(i))
    features.append(f)

batch_size = 10
for feature in features:
    length = feature.length
    input = tf.keras.Input(batch_size=batch_size, shape=(length,), sparse=False,
                           name=feature.name)
    output = tf.keras.layers.Dense(units=2)(input)
    inputs.append(input)
    outputs.append(output)

logits = tf.keras.layers.add(outputs)
probabilities = tf.keras.layers.Softmax()(logits)

model = tf.keras.Model(inputs=inputs, outputs=probabilities)
model.compile(loss=tf.keras.losses.BinaryCrossentropy(),
              metrics=["accuracy"])
model.summary()
Model: "model_3"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
unk0 (InputLayer) [(10, 10)] 0
__________________________________________________________________________________________________
unk1 (InputLayer) [(10, 10)] 0
__________________________________________________________________________________________________
unk2 (InputLayer) [(10, 10)] 0
__________________________________________________________________________________________________
unk3 (InputLayer) [(10, 10)] 0
__________________________________________________________________________________________________
unk4 (InputLayer) [(10, 10)] 0
__________________________________________________________________________________________________
dense_17 (Dense) (10, 2) 22 unk0[0][0]
__________________________________________________________________________________________________
dense_18 (Dense) (10, 2) 22 unk1[0][0]
__________________________________________________________________________________________________
dense_19 (Dense) (10, 2) 22 unk2[0][0]
__________________________________________________________________________________________________
dense_20 (Dense) (10, 2) 22 unk3[0][0]
__________________________________________________________________________________________________
dense_21 (Dense) (10, 2) 22 unk4[0][0]
__________________________________________________________________________________________________
add_3 (Add) (10, 2) 0 dense_17[0][0]
dense_18[0][0]
dense_19[0][0]
dense_20[0][0]
dense_21[0][0]
__________________________________________________________________________________________________
softmax_3 (Softmax) (10, 2) 0 add_3[0][0]
==================================================================================================
Total params: 110
Trainable params: 110
Non-trainable params: 0
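To sanity-check the fixed graph end to end, a short training run on random data could look like the sketch below (the unk0..unk4 input names, 10-dim features, and batch size mirror the toy Feature setup above; the random data and one-hot labels are made up for illustration):

```python
import numpy as np
import tensorflow as tf

# Rebuild the fixed model: 5 named inputs, per-input Dense, add(), then Softmax.
batch_size = 10
inputs, outputs = [], []
for i in range(5):
    inp = tf.keras.Input(batch_size=batch_size, shape=(10,), name='unk' + str(i))
    outputs.append(tf.keras.layers.Dense(units=2)(inp))
    inputs.append(inp)
probabilities = tf.keras.layers.Softmax()(tf.keras.layers.add(outputs))
model = tf.keras.Model(inputs=inputs, outputs=probabilities)
model.compile(loss=tf.keras.losses.BinaryCrossentropy(), metrics=['accuracy'])

# Random inputs keyed by input name, and one-hot 2-class labels.
x = {'unk' + str(i): np.random.rand(batch_size, 10).astype('float32')
     for i in range(5)}
y = np.eye(2)[np.random.randint(0, 2, size=batch_size)].astype('float32')

model.fit(x, y, epochs=2, batch_size=batch_size, verbose=0)
preds = model.predict(x, verbose=0)
print(preds.shape)  # (10, 2); each row sums to ~1 thanks to the Softmax layer
```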
Upvotes: 1