Understanding shapes in Keras layers

Question

I am learning TensorFlow and Keras to implement a many-to-many LSTM model in which the length of the input sequence equals the length of the output sequence.

Sample Code:

Inputs:

voc_size = 10000
embed_dim = 64
lstm_units = 75
size_batch = 30
count_classes = 5

Model:

from tensorflow.keras.layers import (Bidirectional, LSTM,
                                     Dense, Embedding, TimeDistributed)
from tensorflow.keras import Sequential

def sample_build(embed_dim, voc_size, batch_size, lstm_units, count_classes):
    model = Sequential()
    model.add(Embedding(input_dim=voc_size,
                        output_dim=embed_dim, input_length=50))
    model.add(Bidirectional(LSTM(units=lstm_units, return_sequences=True),
                            merge_mode="ave"))
    model.add(Dense(200))
    model.add(TimeDistributed(Dense(count_classes + 1)))

    # Compile model
    model.compile(loss='categorical_crossentropy',
                  optimizer='rmsprop',
                  metrics=['accuracy'])
    model.summary()
    return model


sample_model = sample_build(embed_dim, voc_size,
                            size_batch, lstm_units,
                            count_classes)

I am having trouble understanding the input and output shapes of each layer. For example, the output shape of the Embedding layer is (batch_size, time_steps, embed_dim), which in this case is (30, 50, 64).
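To make the Embedding shape concrete, here is a minimal NumPy sketch of the lookup. It assumes the layer behaves like a table of shape (voc_size, embed_dim) indexed by integer token ids; the names embedding_table and token_ids are made up for illustration, and the random values stand in for learned weights.

```python
import numpy as np

voc_size, embed_dim = 10000, 64
batch_size, time_steps = 30, 50

rng = np.random.default_rng(0)

# Hypothetical stand-in for the learned embedding weights: one row per token id.
embedding_table = rng.normal(size=(voc_size, embed_dim))

# A batch of integer-encoded sequences, one token id per time step.
token_ids = rng.integers(0, voc_size, size=(batch_size, time_steps))

# Indexing the table with a (30, 50) id array replaces every id with its
# 64-dimensional row, so the result gains a trailing axis of size embed_dim.
embedded = embedding_table[token_ids]

print(token_ids.shape)  # (30, 50)
print(embedded.shape)   # (30, 50, 64)
```

The last axis comes from the table's row width, which is why it matches embed_dim rather than anything about the input length.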

Similarly, the output shape of the Bidirectional LSTM layer is (30, 50, 75). This will be the input to the next Dense layer, which has 200 units. But the shape of the weight matrix of the Dense layer is (number of units in the current layer, number of units in the previous layer), which is (200, 75) in this case. So how does the matrix calculation happen between the 2D weight matrix of the Dense layer and the 3D output of the Bidirectional layer? Any explanation that clarifies these shapes would be helpful.
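The 2D-versus-3D part of the question can be sketched in NumPy. This assumes what the Keras Dense documentation describes: the kernel is stored as (input_dim, units), so (75, 200) here rather than (200, 75), and for inputs of rank greater than 2 the dot product is taken along the last axis of the input. The names lstm_out, kernel and bias are illustrative, with random values in place of trained weights.

```python
import numpy as np

batch_size, time_steps, lstm_units, dense_units = 30, 50, 75, 200

rng = np.random.default_rng(0)

# Stand-in for the Bidirectional LSTM output (merge_mode="ave" keeps 75 features).
lstm_out = rng.normal(size=(batch_size, time_steps, lstm_units))

# Assumed Dense parameters: kernel is (input_dim, units), bias is (units,).
kernel = rng.normal(size=(lstm_units, dense_units))
bias = np.zeros(dense_units)

# Contract the last axis of the input with the first axis of the kernel.
dense_out = np.tensordot(lstm_out, kernel, axes=([2], [0])) + bias

# Equivalent view: flatten batch and time into 1500 rows, do an ordinary
# 2D matrix product, then restore the (batch, time) layout.
flat_out = lstm_out.reshape(-1, lstm_units) @ kernel + bias
same = np.allclose(dense_out,
                   flat_out.reshape(batch_size, time_steps, dense_units))

print(dense_out.shape)  # (30, 50, 200)
print(same)             # True
```

In other words, the same (75, 200) matrix is applied independently to each of the 30 x 50 feature vectors, so only the last dimension changes from 75 to 200.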
