Reputation: 767
I am trying to build the CLDNN that is researched in the paper here.
After the convolutional layers, the features go through a dimensionality-reduction layer. At the point where the features leave the conv layers, the dimensions are [?, N, M]. N represents the number of windows, and I think the network requires the reduction to happen along the dimension M, so the dimensions of the features after the dim-reduction layer are [?, N, Q], where Q < M.
I have two questions.
1. How do I do this in TensorFlow? I tried using a weight matrix
W = tf.Variable(tf.truncated_normal([M, Q], stddev=0.1))
I thought tf.matmul(x, W) would yield [?, N, Q], but [?, N, M] and [M, Q] are not valid dimensions for matrix multiplication. I would like to keep N constant and reduce the dimension from M to Q (see the minimal sketch after this list).
2. What kind of non-linearity should I apply to the outcome of tf.matmul(x, W)? I was thinking about using a ReLU, but I couldn't even get #1 done.
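For concreteness, here is a minimal sketch of the failing attempt from question 1 (all sizes are made up for illustration):
import tensorflow as tf

N, M, Q = 11, 256, 40  # hypothetical sizes, only for illustration
x = tf.placeholder(tf.float32, [None, N, M])  # stand-in for the conv-layer output
W = tf.Variable(tf.truncated_normal([M, Q], stddev=0.1))
y = tf.matmul(x, W)  # raises an error: rank-3 tensor vs. rank-2 weight matrix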
Upvotes: 2
Views: 5019
Reputation: 14336
According to the linked paper (T. N. Sainath et al.: "Convolutional, Long Short-Term Memory, Fully Connected Deep Neural Networks"),
[...] reducing the dimensionality, such that we have 256 outputs from the linear layer, was appropriate.
That means that whatever the input size is, i.e. [?, N, M] or any other dimensionality (always assuming that the first dimension is the number of samples in a mini-batch, denoted by ?), the output will be [?, Q], where typically Q=256.
Since we are doing the dimensionality reduction by multiplying the input with a weight matrix, no spatial information will be preserved. This means it doesn't matter whether each input is a matrix or a vector, so we can reshape the input to the linear layer x to have the dimensions [?, N*M]. Then we can perform a simple matrix multiplication tf.matmul(x, W), where W is a matrix with the dimensions [N*M, Q]:
import tensorflow as tf

W = tf.Variable(tf.truncated_normal([N * M, Q], stddev=0.1))  # learned projection weights
x_vec = tf.reshape(x, shape=(-1, N * M))  # flatten each [N, M] feature map to a vector
y = tf.matmul(x_vec, W)  # output has shape [?, Q]
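As a quick shape check, the same snippet can be run end to end with made-up sizes (N and M below are assumptions for illustration; only Q=256 comes from the paper):
import tensorflow as tf

N, M, Q = 11, 256, 256  # N, M hypothetical; the paper uses 256 outputs
x = tf.placeholder(tf.float32, [None, N, M])  # stand-in for the conv-layer output
W = tf.Variable(tf.truncated_normal([N * M, Q], stddev=0.1))
x_vec = tf.reshape(x, shape=(-1, N * M))
y = tf.matmul(x_vec, W)
print(y.shape)  # (?, 256) -- the batch dimension stays unknown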
Finally, regarding question 2: in the paper, the dimensionality reduction layer is a linear layer, i.e. you do not apply a non-linearity to the output.
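If you prefer a higher-level API, the same linear layer can be written in TF 1.x with tf.layers.dense and activation=None (note that, unlike the manual version above, this variant also adds a bias term by default):
y = tf.layers.dense(x_vec, units=Q, activation=None)  # linear layer: no non-linearity applied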
Upvotes: 2