Reputation: 33
I am writing a custom layer in TensorFlow 2.0 and I ran into a problem, as follows:
I want to map a 1D weight array (5x1) into a 2D array (10x10). Suppose I have the indices for the 1D-to-2D mapping in weight_index_lst:
weight_id, row, col
1,5,6
2,6,7
3,7,8
4,8,9
5,9,10
All other locations of the 2D array just get a value of 0. Here's my script for the custom layer. My input has shape (10x1). w_mat should receive 0 everywhere that self.w is not assigned:
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

class mylayer(layers.Layer):
    def __init__(self, weight_index_lst, **kwargs):
        super(mylayer, self).__init__(**kwargs)
        self.weight_index_lst = weight_index_lst

    def build(self, input_shape):
        self.w = self.add_weight(shape=(5, 1),
                                 initializer='he_normal',
                                 trainable=True)

    def call(self, inputs):
        ct = 0
        w_mat = tf.Variable(np.zeros((10, 10)), dtype='float32', trainable=False)
        for i in range(5):
            i1 = self.weight_index_lst[i, 1]  # row index
            i2 = self.weight_index_lst[i, 2]  # column index
            w_mat[i1, i2].assign(self.w[ct, 0])  # problem: no gradient provided
            # or w_mat[i1, i2] = self.w[ct, 0]  # resource variable cannot be assigned
            ct = ct + 1
        y = tf.matmul(w_mat, inputs)
        return y
I could have declared a (10x10) weight array, but my network needs the other entries to be 0 and not trainable.
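For reference, here is a minimal sketch of the no-gradient behaviour (a simplified repro of my own, assuming TF 2.x eager execution; the shapes are made up):
import numpy as np
import tensorflow as tf

w = tf.Variable([[1.0]])  # trainable weight
with tf.GradientTape() as tape:
    m = tf.Variable(np.zeros((2, 2)), dtype='float32', trainable=False)
    m[0, 1].assign(w[0, 0])  # the assignment is not recorded on the tape
    y = tf.reduce_sum(m)
print(tape.gradient(y, w))  # None -- no gradient flows through assign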
Upvotes: 1
Views: 302
Reputation: 4990
If you specifically want to create a new layer with trainable weights, then the fix for your problem (no gradients propagating through assign) is to express everything as symbolic tensor operations; then TF will be able to propagate the gradients. One way to do that: create the 1D tensor of weights you want to train, prepend a non-trainable constant 0.0 to it, and then use tf.gather to pick either the needed weight or the constant zero for each of the n**2 elements of the matrix you multiply the layer's input by. Since all operations are symbolic tensor operations, TF can propagate gradients with no problems. Code for this approach below:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
import numpy as np

class mylayer(layers.Layer):
    def __init__(self, n, weight_index_lst, **kwargs):
        super(mylayer, self).__init__(**kwargs)
        self.weight_index_lst = weight_index_lst
        self.n = n

    def build(self, input_shape):
        self.w = self.add_weight(shape=(len(self.weight_index_lst),),
                                 initializer='he_normal',
                                 trainable=True)

    def call(self, inputs):
        const_zero = tf.constant([0.], dtype=tf.float32)
        const_zero_and_weights = tf.concat([const_zero, self.w], axis=0)
        ct = 1  # start with 1 since 0 means take the non-trainable 0. from const_zero_and_weights
        selector = np.zeros((self.n ** 2), dtype=np.int32)  # indices
        for i, j in self.weight_index_lst:
            selector[i * self.n + j] = ct
            ct = ct + 1
        t_ind = tf.constant(selector, dtype=tf.int32)
        w_flattened = tf.gather(const_zero_and_weights, t_ind)
        w_matrix = tf.reshape(w_flattened, (self.n, self.n))
        y = tf.matmul(w_matrix, inputs)
        return y

m = tf.keras.Sequential([
    layers.Dense(21**2, input_shape=(45,)),
    layers.Reshape(target_shape=(21, 21)),
    mylayer(21, [(4,5), (5,6), (6,7), (7,8), (8,9)]),
])
m.summary()
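As a quick sanity check (an assumed usage sketch, not part of the original answer), you can confirm that gradients now reach the layer's weights:
x = tf.random.normal((2, 45))  # dummy batch matching input_shape=(45,)
with tf.GradientTape() as tape:
    loss = tf.reduce_sum(m(x))
grads = tape.gradient(loss, m.trainable_variables)
print(grads[-1])  # gradient for mylayer's self.w -- a real tensor, not None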
Upvotes: 1
Reputation: 4990
You don't need to create a trainable layer for this. Consider just using a non-trainable Lambda layer:
def select_as_needed(x, wrc, n):
    # selector: for each output cell, the index of the input element we want to select (0 otherwise)
    selector = np.zeros(n * n, dtype=np.int32)
    # mask: 0./1. tensor with ones only at the positions where we put some selected element
    mask = np.zeros(n * n, dtype=np.float32)
    for w, r, c in wrc:
        selector[r * n + c] = w
        mask[r * n + c] = 1.0
    t_ind = tf.constant(selector, dtype=tf.int32)
    t_mask = tf.constant(mask, dtype=tf.float32)
    # without the mask, the 0-index value of the input would go to every
    # position for which we didn't select anything
    return tf.gather(x, t_ind, axis=1) * t_mask
wrc = [(0,4,5), (1,5,6), (2,6,7), (3,7,8), (4,8,9)]  # same as your table, but 0-based
n = 10
model = tf.keras.models.Sequential([
    # ... your stuff
    tf.keras.layers.Dense(5, 'linear'),  # 5 outputs per sample (or whatever else produces 5 outputs)
    tf.keras.layers.Lambda(select_as_needed, arguments={'wrc': wrc, 'n': n}),
    tf.keras.layers.Reshape(target_shape=(n, n)),
])
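A quick placement check (with an assumed dummy input, not part of the original answer):
x = tf.constant([[10., 20., 30., 40., 50.]])  # one sample with 5 values
out = tf.reshape(select_as_needed(x, wrc, n), (n, n))
print(out.numpy()[4, 5], out.numpy()[5, 6])  # 10.0 20.0 -- each value lands at its (row, col)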
Upvotes: 0