sim

Reputation: 23

custom layer with diagonal weight matrix

I want to implement a classifier with a sparse input layer. My data has about 60 dimensions and I want to check for feature importance. To do this I want the first layer to have a diagonal weight matrix (to which I want to apply an L1 kernel regularizer); all off-diagonal entries should be non-trainable zeros. So a one-to-one connection per input channel, whereas a Dense layer would mix the input variables. I checked Specify connections in NN (in keras) and Custom connections between layers Keras. The latter I could not use, as Lambda layers do not introduce trainable weights.

Something like this however does not affect the actual weight matrix:

import tensorflow as tf
from tensorflow.keras import backend as K
from tensorflow.keras.layers import Layer

class MyLayer(Layer):
    def __init__(self, output_dim, connection, **kwargs):
        self.output_dim = output_dim
        self.connection = connection
        super(MyLayer, self).__init__(**kwargs)

    def build(self, input_shape):
        # Create a trainable weight variable for this layer.
        self.kernel = self.add_weight(name='kernel',
                                      shape=(input_shape[1], self.output_dim),
                                      initializer='uniform',
                                      trainable=True)
        self.kernel = tf.linalg.tensor_diag_part(self.kernel)
        self.kernel = tf.linalg.tensor_diag(self.kernel)
        super(MyLayer, self).build(input_shape)  # Be sure to call this at the end

    def call(self, x):
        return K.dot(x, self.kernel)

    def compute_output_shape(self, input_shape):
        return (input_shape[0], self.output_dim)

When I train the model and print the weights, I do not get a diagonal matrix for the first layer.

What am I doing wrong?

Upvotes: 2

Views: 2267

Answers (1)

pitfall

Reputation: 2621

I'm not quite sure what you want to do exactly, because to me a diagonal matrix implies a square matrix, which in turn implies that your layer's input and output dimensionality should be unchanged.

Anyway, let's talk about the square matrix case first. I think there are two ways of implementing a weight matrix whose off-diagonal values are all zero.

Method 1: only conceptually follow the square matrix idea, and implement this layer with a trainable weight vector, as outlined below.

# instead of writing y = K.dot(x, W),
# where W is the NxN weight matrix with zero values off the diagonal,
# write y = x * w, where w is the weight vector 1xN
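
Here is a minimal sketch of what Method 1 could look like as a custom layer, assuming tf.keras; the class name, the 'ones' initializer, and the L1 strength are illustrative choices, not something from the original post:

import tensorflow as tf
from tensorflow.keras import regularizers
from tensorflow.keras.layers import Layer

class DiagonalLayer(Layer):
    """One trainable weight per input feature, i.e. the diagonal of the
    conceptual NxN matrix; the off-diagonal zeros are never materialized."""
    def __init__(self, l1=0.01, **kwargs):
        super(DiagonalLayer, self).__init__(**kwargs)
        self.l1 = l1

    def build(self, input_shape):
        # weight vector w of length N, with the L1 penalty from the question
        self.w = self.add_weight(name='diag',
                                 shape=(input_shape[-1],),
                                 initializer='ones',
                                 regularizer=regularizers.l1(self.l1),
                                 trainable=True)
        super(DiagonalLayer, self).build(input_shape)

    def call(self, x):
        # element-wise product x * w is equivalent to x . diag(w)
        return x * self.w

    def compute_output_shape(self, input_shape):
        return input_shape

Since the off-diagonal zeros never exist as weights, nothing has to be masked or constrained during training.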

Method 2: use the default Dense layer, but with your own constraint.

# all you need is to create a mask matrix M, which is an NxN identity matrix,
# and then you can write a constraint like the one below
from tensorflow.keras import backend as K
from tensorflow.keras.constraints import Constraint

class DiagonalWeight(Constraint):
    """Constrains the weights to be diagonal."""
    def __call__(self, w):
        N = K.int_shape(w)[-1]
        m = K.eye(N)   # identity mask
        w *= m         # zero out all off-diagonal entries
        return w

Of course, you should use Dense(..., kernel_constraint=DiagonalWeight()).
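
For example, a model for the 60-dimensional input from the question might look roughly like this; it is only a sketch, and the layer sizes, the L1 strength, and use_bias=False are assumptions rather than something the constraint requires:

from tensorflow.keras import regularizers
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.models import Model

inputs = Input(shape=(60,))
# square 60 -> 60 layer so the kernel can be masked to a diagonal;
# the L1 kernel regularizer from the question is applied to the same kernel
x = Dense(60,
          use_bias=False,
          kernel_regularizer=regularizers.l1(0.01),
          kernel_constraint=DiagonalWeight())(inputs)
outputs = Dense(1, activation='sigmoid')(x)
model = Model(inputs, outputs)

Note that Keras constraints are applied after each optimizer update, so the off-diagonal entries are reset to zero at every training step.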

Upvotes: 4
