Isaac Breen

Reputation: 160

Naming weights of a layer within a custom layer

I have a custom layer with a Dense sublayer. I want to be able to name the weights of this sublayer. However, passing name="my_dense" to the sublayer's initializer doesn't seem to do this; the weights simply get named after the outer custom layer.

To illustrate the problem, suppose I want a custom layer that simply stacks two dense layers. I'll print the names of the weights of this custom layer.

import tensorflow as tf
from tensorflow import keras

class DoubleDense(keras.layers.Layer):
  def __init__(self, units, **kwargs):
    # Initialise the base Layer before creating sublayers
    super(DoubleDense, self).__init__(**kwargs)
    self.dense1 = keras.layers.Dense(units, name="first_dense")
    self.dense2 = keras.layers.Dense(units, name="second_dense")

  def build(self, input_shape):
    self.dense1.build(input_shape)
    # Dense.build expects a shape, so pass (batch, units) rather than a bare int
    self.dense2.build((None, self.dense1.units))

  def call(self, inputs):
    hidden = self.dense1(inputs)
    return self.dense2(hidden)

dd = DoubleDense(3)

# We need to evaluate the layer once to build the weights
trivial_input = tf.ones((1,10))
output = dd(trivial_input)

# Print the names of all variables in the DoubleDense layer
print([weight.name for weight in dd.weights])

The output is this:

['double_dense_1/kernel:0',
 'double_dense_1/bias:0',
 'double_dense_1/kernel:0',
 'double_dense_1/bias:0']

...but I was expecting something more like this:

['double_dense_1/first_dense_1/kernel:0',
 'double_dense_1/first_dense_1/bias:0',
 'double_dense_1/second_dense_1/kernel:0',
 'double_dense_1/second_dense_1/bias:0']

So, Keras has named these weights ambiguously; there is no way to tell whether a weight tensor belongs to dd.dense1 or dd.dense2 by its name alone. I realise I could select the layer first and then the weights (dd.dense1.weights), but I would prefer not to do this in my application.
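The ambiguity can be checked mechanically. The snippet below is a minimal sketch that works on the name strings copied from the output above (it does not need TensorFlow); the same check could presumably be applied to [weight.name for weight in dd.weights].

```python
from collections import Counter

# Weight names copied from the output printed above.
names = [
    "double_dense_1/kernel:0",
    "double_dense_1/bias:0",
    "double_dense_1/kernel:0",
    "double_dense_1/bias:0",
]

# Any name that occurs more than once cannot identify a single weight.
duplicates = sorted(name for name, count in Counter(names).items() if count > 1)
print(duplicates)
```

Every name appears twice here, so no weight can be traced back to its sublayer from the name.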

Is there a way to name the weights of a sublayer of a custom layer?

Upvotes: 3

Views: 1403

Answers (1)

user11530462

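To get distinct names for the sublayers' weights, build each sublayer inside its own tf.name_scope. The name argument has no effect in your version most likely because build() is called on the sublayers directly, so the name scope that a layer normally enters when it is called is never opened around the weight creation.

The modified code below gives each sublayer's weights a distinct name.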

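import tensorflow as tf
from tensorflow import keras

class DoubleDense(keras.layers.Layer):
  def __init__(self, units, **kwargs):
    # Initialise the base Layer before creating sublayers
    super(DoubleDense, self).__init__(**kwargs)
    self.dense1 = keras.layers.Dense(units)
    self.dense2 = keras.layers.Dense(units)

  def build(self, input_shape):
    # Variables created inside a name scope get that scope as a name prefix
    with tf.name_scope("first_dense"):
      self.dense1.build(input_shape)
    with tf.name_scope("second_dense"):
      # Dense.build expects a shape, so pass (batch, units) rather than a bare int
      self.dense2.build((None, self.dense1.units))

  def call(self, inputs):
    hidden = self.dense1(inputs)
    return self.dense2(hidden)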


dd = DoubleDense(3)


# We need to evaluate the layer once to build the weights
trivial_input = tf.ones((1,10))
output = dd(trivial_input)

# Print the names of all variables in the DoubleDense layer
print([weight.name for weight in dd.weights])  

Output:

['double_dense/first_dense/kernel:0', 'double_dense/first_dense/bias:0', 'double_dense/second_dense/kernel:0', 'double_dense/second_dense/bias:0']  
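With the scope in the name, the owning sublayer can be read directly off each weight name. Here is a minimal sketch using the name strings from the output above; the helper sublayer_of is a hypothetical name introduced for illustration, and it assumes the "outer/sublayer/variable:0" layout shown in that output.

```python
from collections import defaultdict

# Weight names copied from the output above.
names = [
    "double_dense/first_dense/kernel:0",
    "double_dense/first_dense/bias:0",
    "double_dense/second_dense/kernel:0",
    "double_dense/second_dense/bias:0",
]

def sublayer_of(weight_name):
    """Return the scope just before the variable name (hypothetical helper)."""
    scopes = weight_name.split("/")
    # Assumes an "outer/sublayer/variable:0" layout, as in the output above.
    return scopes[-2]

# Group the weight names by the sublayer they belong to.
by_sublayer = defaultdict(list)
for name in names:
    by_sublayer[sublayer_of(name)].append(name)

print(dict(by_sublayer))
```

In a real model the same grouping could be keyed on weight.name for each entry of dd.weights, provided the names follow this layout.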

Hope this answers your question. Happy learning!

Upvotes: 4
