Reputation: 376
In TF 1.x, it was possible to build layers with custom variables. Here's an example:
import numpy as np
import tensorflow as tf

def make_custom_getter(custom_variables):
    def custom_getter(getter, name, **kwargs):
        if name in custom_variables:
            variable = custom_variables[name]
        else:
            variable = getter(name, **kwargs)
        return variable
    return custom_getter

# Make a custom getter for the dense layer variables.
# Note: custom variables can result from arbitrary computation;
# for the sake of this example, we make them just constant tensors.
custom_variables = {
    "model/dense/kernel": tf.constant(
        np.random.rand(784, 64), name="custom_kernel", dtype=tf.float32),
    "model/dense/bias": tf.constant(
        np.random.rand(64), name="custom_bias", dtype=tf.float32),
}
custom_getter = make_custom_getter(custom_variables)

# Compute hiddens using a dense layer with custom variables.
x = tf.random.normal(shape=(1, 784), name="inputs")
with tf.variable_scope("model", custom_getter=custom_getter):
    Layer = tf.layers.Dense(64)
    hiddens = Layer(x)

print(Layer.variables)
The printed variables of the constructed dense layer will be the custom tensors we specified in the custom_variables dict:
[<tf.Tensor 'custom_kernel:0' shape=(784, 64) dtype=float32>, <tf.Tensor 'custom_bias:0' shape=(64,) dtype=float32>]
This allows us to create layers/models that use the tensors provided in custom_variables directly as their weights, so that we can further differentiate the output of the layers/models with respect to any tensors that custom_variables may depend on (particularly useful for implementing functionality such as modulating sub-networks, parameter generation, meta-learning, etc.).
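To make the differentiation point concrete, here is a minimal, self-contained sketch (TF 1.x graph mode in a fresh graph, reusing make_custom_getter from above; theta is a hypothetical upstream variable): because the dense layer's kernel is a tensor computed from theta, gradients of the layer output flow back to theta.
import numpy as np
import tensorflow as tf

theta = tf.get_variable("theta", shape=(784, 64), dtype=tf.float32)
custom_variables = {
    # The kernel is now produced by an (arbitrary) computation on theta.
    "model/dense/kernel": 2.0 * theta,
    "model/dense/bias": tf.constant(np.random.rand(64), dtype=tf.float32),
}
custom_getter = make_custom_getter(custom_variables)

x = tf.random.normal(shape=(1, 784))
with tf.variable_scope("model", custom_getter=custom_getter):
    hiddens = tf.layers.Dense(64)(x)

# Well-defined: hiddens depends on theta through the custom kernel.
grads = tf.gradients(hiddens, theta)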
Variable scopes used to make it easy to nest all graph-building inside scopes with custom getters and build models on top of the provided tensors as their parameters. Since sessions and variable scopes are no longer advisable in TF 2.0 (and all of that low-level stuff has moved to tf.compat.v1), what would be the best practice to implement the above using Keras and TF 2.0?
(Related issue on GitHub.)
Upvotes: 6
Views: 3672
Reputation: 86600
Given you have:
kernel = createTheKernelVarBasedOnWhatYouWant() #shape (784, 64)
bias = createTheBiasVarBasedOnWhatYouWant() #shape (64,)
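For instance, a minimal sketch (any tensors or variables with these shapes would do; the names here are just illustrative):
import numpy as np
import tensorflow as tf

kernel = tf.Variable(np.random.rand(784, 64), dtype=tf.float32, name="custom_kernel")
bias = tf.Variable(np.random.rand(64), dtype=tf.float32, name="custom_bias")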
Make a simple function copying the code from Dense:
from tensorflow.keras import backend as K

def custom_dense(x):
    # x is a list: [inputs, kernel, bias]
    inputs, kernel, bias = x
    outputs = K.dot(inputs, kernel)
    outputs = K.bias_add(outputs, bias, data_format='channels_last')
    return outputs
Use the function in a Lambda layer:
from tensorflow.keras.layers import Lambda

layer = Lambda(custom_dense)
hiddens = layer([x, kernel, bias])
Warning: kernel and bias must be produced from a Keras layer, or come from kernel = Input(tensor=the_kernel_var) and bias = Input(tensor=bias_var).
If the warning above is bad for you, you can always use kernel and bias "from outside", like:
def custom_dense(inputs):
    outputs = K.dot(inputs, kernel)  # where kernel is not part of the arguments anymore
    outputs = K.bias_add(outputs, bias, data_format='channels_last')
    return outputs
layer = Lambda(custom_dense)
hiddens = layer(x)
This last option makes it a bit more complicated to save/load models.
You should probably use a Keras Dense layer and set its weights in a standard way:
layer = tf.keras.layers.Dense(64, name='the_layer')
layer.set_weights([np.random.rand(784, 64), np.random.rand(64)])
If you need these weights to be non-trainable, set the following before compiling the Keras model:
model.get_layer('the_layer').trainable=False
If you want direct access to the variables as tensors, they are:
kernel = layer.kernel
bias = layer.bias
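If the goal is to differentiate the layer's output with respect to these tensors, a minimal sketch using tf.GradientTape (assuming TF 2.x eager execution) could look like this:
import numpy as np
import tensorflow as tf

layer = tf.keras.layers.Dense(64, name='the_layer')
layer.build((1, 784))  # creates layer.kernel and layer.bias
layer.set_weights([np.random.rand(784, 64), np.random.rand(64)])

x = tf.random.normal(shape=(1, 784))
with tf.GradientTape() as tape:
    hiddens = layer(x)
    loss = tf.reduce_sum(hiddens)

# Gradients with respect to the layer's kernel and bias variables.
grads = tape.gradient(loss, [layer.kernel, layer.bias])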
There are plenty of other options, but that depends on your exact intention, which is not clear in your question.
Upvotes: 2
Reputation: 376
Below is a general-purpose solution that works with arbitrary Keras models in TF2.
First, we need to define an auxiliary function canonical_variable_name and a context manager custom_make_variable with the following signatures (see the implementation in the meta-blocks library).
def canonical_variable_name(variable_name: str, outer_scope: str):
    """Returns the canonical variable name: `outer_scope/.../name`."""
    # ...

@contextlib.contextmanager
def custom_make_variable(
    canonical_custom_variables: Dict[str, tf.Tensor], outer_scope: str
):
    """A context manager that overrides `make_variable` with a custom function.

    When building layers, Keras uses the `make_variable` function to create weights
    (kernels and biases for each layer). This function wraps `make_variable` with
    a closure that infers the canonical name of the variable being created (of the
    form `outer_scope/.../var_name`) and looks it up in the `custom_variables` dict
    that maps canonical names to tensors. The function adheres to the following logic:

    * If there is a match, it does a few checks (shape, dtype, etc.) and returns
      the found tensor instead of creating a new variable.
    * If there is a match but the checks fail, it throws an exception.
    * If there is no matching entry in `custom_variables`, it calls the original
      `make_variable` utility function and returns a newly created variable.
    """
    # ...
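For illustration only, here is a rough sketch of one way such a context manager could be put together. This is not the meta-blocks implementation: it patches the internal tensorflow.python.keras.engine.base_layer_utils.make_variable utility (whose location is version-dependent) and replaces the canonical-name inference with a crude suffix match.
import contextlib
from typing import Dict
from unittest import mock

import tensorflow as tf
from tensorflow.python.keras.engine import base_layer_utils  # internal, version-dependent


@contextlib.contextmanager
def custom_make_variable(
    canonical_custom_variables: Dict[str, tf.Tensor], outer_scope: str
):
    """Overrides `make_variable` with a custom function within the context."""
    # NOTE: outer_scope is unused in this simplified sketch.
    original_make_variable = base_layer_utils.make_variable

    def _make_variable(name, *args, **kwargs):
        # Simplification: the real implementation infers the full canonical
        # name (`outer_scope/.../name`); here we match any custom variable
        # whose canonical name ends with the bare weight name (e.g. "kernel").
        for canonical_name, tensor in canonical_custom_variables.items():
            if canonical_name == name or canonical_name.endswith("/" + name):
                return tensor  # substitute the provided tensor
        return original_make_variable(name, *args, **kwargs)

    with mock.patch.object(base_layer_utils, "make_variable", _make_variable):
        yield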
Using these functions, we can create arbitrary Keras models with custom tensors used as variables:
import numpy as np
import tensorflow as tf

canonical_custom_variables = {
    "model/dense/kernel": tf.constant(
        np.random.rand(784, 64), name="custom_kernel", dtype=tf.float32),
    "model/dense/bias": tf.constant(
        np.random.rand(64), name="custom_bias", dtype=tf.float32),
}

# Compute hiddens using a dense layer with custom variables.
x = tf.random.normal(shape=(1, 784), name="inputs")
with custom_make_variable(canonical_custom_variables, outer_scope="model"):
    Layer = tf.keras.layers.Dense(64)
    hiddens = Layer(x)

print(Layer.variables)
Upvotes: 0
Reputation: 5188
Not entirely sure I understand your question correctly, but it seems to me that it should be possible to do what you want with a combination of custom layers and the Keras functional API.
Custom layers allow you to build any layer you want in a way that is compatible with Keras, e.g.:
class MyDenseLayer(tf.keras.layers.Layer):
    def __init__(self, num_outputs, activation=None):
        super(MyDenseLayer, self).__init__()
        self.num_outputs = num_outputs
        # Resolve the activation by name so the functional example below
        # can pass activation='relu' / 'softmax'.
        self.activation = tf.keras.activations.get(activation)

    def build(self, input_shape):
        self.kernel = self.add_weight("kernel",
                                      shape=[int(input_shape[-1]),
                                             self.num_outputs],
                                      initializer='normal')
        self.bias = self.add_weight("bias",
                                    shape=[self.num_outputs],
                                    initializer='normal')

    def call(self, inputs):
        return self.activation(tf.matmul(inputs, self.kernel) + self.bias)
and the functional API allows you to access the outputs of said layers and re-use them:
from tensorflow import keras

inputs = keras.Input(shape=(784,), name='img')
x1 = MyDenseLayer(64, activation='relu')(inputs)
x2 = MyDenseLayer(64, activation='relu')(x1)
outputs = MyDenseLayer(10, activation='softmax')(x2)

model = keras.Model(inputs=inputs, outputs=outputs, name='mnist_model')
Here x1 and x2 can be connected to other subnets.
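For example, a small hedged sketch of connecting x1 to an additional sub-network (the aux_output and multi_head names are made up for illustration):
# A second head reuses the intermediate representation x1.
aux_output = MyDenseLayer(10, activation='softmax')(x1)
multi_head = keras.Model(inputs=inputs, outputs=[outputs, aux_output],
                         name='mnist_multi_head')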
Upvotes: -1