Reputation: 85
I'm trying to create a model for ordinal regression as explained by this paper. A major part of it is sharing the weights in the final layer, but not the biases, in order to obtain rank monotonicity (basically, to ensure that P[Y>N] is never greater than P[Y>N-1] for any N). This is highly desirable for me, since there are a couple of values for which I have very few samples, but I would still like to get their probabilities. So far I have implemented the paper's label encoding, but the predictions are not rank-monotonic: sometimes P(Y>5) > P(Y>4).
How exactly can I accomplish weight sharing but not bias sharing in Keras? I know the functional API has a way to share weights and biases but that doesn't help in this scenario. Thanks to anyone that can help.
Edit: Sharing the weights but not the biases, either within one layer with N neurons or between N layers, would solve the problem. I also think that setting the use_bias argument of Dense() to False and creating a custom bias layer of some sort could work, but I am not sure how to do that.
I think the equations for six neurons and five inputs would be:
op1 = w1z1 + w2z2 + w3z3 + w4z4 + w5z5 + b1
op2 = w1z1 + w2z2 + w3z3 + w4z4 + w5z5 + b2
op3 = w1z1 + w2z2 + w3z3 + w4z4 + w5z5 + b3
op4 = w1z1 + w2z2 + w3z3 + w4z4 + w5z5 + b4
op5 = w1z1 + w2z2 + w3z3 + w4z4 + w5z5 + b5
op6 = w1z1 + w2z2 + w3z3 + w4z4 + w5z5 + b6
where w1 to w5 are the shared weights, z1 to z5 are the inputs, and b1 to b6 are the bias terms.
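As a minimal NumPy sketch of the equations above (the values of `w`, `b`, and `z` are made up for illustration, and the helper names `outputs` and `sigmoid` are hypothetical), the six outputs are one shared dot product plus a per-output bias. Because the outputs differ only by their bias, the order of the resulting probabilities depends only on the order of the biases:

```python
import numpy as np

rng = np.random.default_rng(0)

# Shared weights w1..w5 (illustrative random values).
w = rng.normal(size=5)
# Per-output biases b1..b6 (illustrative values, chosen to be decreasing).
b = np.array([2.0, 1.0, 0.5, 0.0, -1.0, -2.0])


def outputs(z):
    """Return op1..op6 for one input vector z of length 5."""
    # z @ w is a single scalar shared by all outputs; adding b broadcasts
    # it to shape (6,), giving op_k = w . z + b_k.
    return z @ w + b


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


z = rng.normal(size=5)
probs = sigmoid(outputs(z))

# With decreasing biases, the probabilities are non-increasing for any z.
print(np.all(np.diff(probs) <= 0))
```

Note that this only shows that shared weights reduce the ordering question to the biases; it does not by itself force the learned biases to be ordered.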
Upvotes: 6
Views: 1985
Reputation: 2642
One way to achieve this is to define a custom bias layer. Here is how you could do it.
PS: Change the input shapes and the initializer according to your needs.
import tensorflow as tf

print('TensorFlow:', tf.__version__)


class BiasLayer(tf.keras.layers.Layer):
    """Adds a trainable bias vector to its input; holds no other weights."""

    def __init__(self, units, *args, **kwargs):
        super(BiasLayer, self).__init__(*args, **kwargs)
        self.bias = self.add_weight(name='bias',
                                    shape=[units],
                                    initializer='zeros',
                                    trainable=True)

    def call(self, x):
        return x + self.bias


# Five separate scalar inputs.
z1 = tf.keras.Input(shape=[1])
z2 = tf.keras.Input(shape=[1])
z3 = tf.keras.Input(shape=[1])
z4 = tf.keras.Input(shape=[1])
z5 = tf.keras.Input(shape=[1])

# One Dense layer without a bias; reusing the same instance shares its kernel.
dense_layer = tf.keras.layers.Dense(units=10, use_bias=False)

# Each output gets its own BiasLayer, so the biases are not shared.
op1 = BiasLayer(units=10)(dense_layer(z1))
op2 = BiasLayer(units=10)(dense_layer(z2))
op3 = BiasLayer(units=10)(dense_layer(z3))
op4 = BiasLayer(units=10)(dense_layer(z4))
op5 = BiasLayer(units=10)(dense_layer(z5))

model = tf.keras.Model(inputs=[z1, z2, z3, z4, z5], outputs=[op1, op2, op3, op4, op5])
model.summary()
Output:
TensorFlow: 2.1.0-dev20200107
Model: "model"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
input_1 (InputLayer)            [(None, 1)]          0
__________________________________________________________________________________________________
input_2 (InputLayer)            [(None, 1)]          0
__________________________________________________________________________________________________
input_3 (InputLayer)            [(None, 1)]          0
__________________________________________________________________________________________________
input_4 (InputLayer)            [(None, 1)]          0
__________________________________________________________________________________________________
input_5 (InputLayer)            [(None, 1)]          0
__________________________________________________________________________________________________
dense (Dense)                   (None, 10)           10          input_1[0][0]
                                                                 input_2[0][0]
                                                                 input_3[0][0]
                                                                 input_4[0][0]
                                                                 input_5[0][0]
__________________________________________________________________________________________________
bias_layer (BiasLayer)          (None, 10)           10          dense[0][0]
__________________________________________________________________________________________________
bias_layer_1 (BiasLayer)        (None, 10)           10          dense[1][0]
__________________________________________________________________________________________________
bias_layer_2 (BiasLayer)        (None, 10)           10          dense[2][0]
__________________________________________________________________________________________________
bias_layer_3 (BiasLayer)        (None, 10)           10          dense[3][0]
__________________________________________________________________________________________________
bias_layer_4 (BiasLayer)        (None, 10)           10          dense[4][0]
==================================================================================================
Total params: 60
Trainable params: 60
Non-trainable params: 0
__________________________________________________________________________________________________
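If what you need is the single-layer form from the question (one input vector of five features, six outputs that share w1 to w5 but each have their own bias), the same idea can be sketched with one bias-free Dense unit followed by a bias layer that broadcasts. This is only a sketch under assumptions: the sizes `num_features` and `num_outputs` and the sigmoid activation are my reading of the question, not something taken from the paper, and the bias layer here creates its weight in `build` rather than `__init__`.

```python
import numpy as np
import tensorflow as tf


class BiasLayer(tf.keras.layers.Layer):
    """Adds one trainable bias per output unit; holds no other weights."""

    def __init__(self, units, **kwargs):
        super().__init__(**kwargs)
        self.units = units

    def build(self, input_shape):
        self.bias = self.add_weight(name="bias",
                                    shape=(self.units,),
                                    initializer="zeros",
                                    trainable=True)

    def call(self, x):
        # x has shape (batch, 1); adding a (units,) bias broadcasts the
        # shared value to shape (batch, units).
        return x + self.bias


num_features = 5  # z1..z5 in the question (assumed size)
num_outputs = 6   # op1..op6 in the question (assumed size)

z = tf.keras.Input(shape=(num_features,))
# A single unit without bias computes w1*z1 + ... + w5*z5 once.
shared = tf.keras.layers.Dense(1, use_bias=False)(z)
# Every output reuses that value and adds its own bias b1..b6.
logits = BiasLayer(num_outputs)(shared)
# Sigmoid is an assumption here: one binary probability per threshold.
probs = tf.keras.layers.Activation("sigmoid")(logits)

model = tf.keras.Model(inputs=z, outputs=probs)

x = np.random.rand(4, num_features).astype("float32")
p = model.predict(x, verbose=0)
print(p.shape)  # (4, 6)
```

This model has 5 shared weights plus 6 biases. Since all six logits differ only by their bias, the outputs are ordered the same way for every input, and they are monotone whenever the learned biases are ordered.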
Upvotes: 5