mo-seph

Reputation: 6223

Dimensionality of Keras Dense layer

I've got a Keras model, with sizes as follows:

________________________________________________________________________________
Layer (type)              Output Shape              Param #
================================================================================
stft (InputLayer)         (None, 1, 16384)          0
________________________________________________________________________________
static_stft (Spectrogram) (None, 1, 65, 256)        16640
________________________________________________________________________________
conv2d_1 (Conv2D)         (None, 38, 5, 9)          12882
________________________________________________________________________________
dense_1 (Dense)           (None, 38, 5, 512)        5120
________________________________________________________________________________
predictions (Dense)       (None, 38, 5, 368)        188784
================================================================================

I'm confused about the dimensionality of the Dense layers at the end. I was hoping to have (None,512) and (None,368) respectively. This is suggested by answers like: Keras lstm and dense layer

The final Dense layers are created as follows:

x = keras.layers.Dense(512)(x)
outputs = keras.layers.Dense(
        368, activation='sigmoid', name='predictions')(x)

So why do they have more than 512 outputs? And how can I change this?

Upvotes: 0

Views: 419

Answers (2)

Will.Evo

Reputation: 1177

Depending on your application you could flatten after the Conv2D layer:

from tensorflow.keras.layers import Input, Reshape, Flatten, Dense

input_layer = Input((1, 1710))
x = Reshape((38, 5, 9))(input_layer)
x = Flatten()(x)
x = Dense(512)(x)
x = Dense(368)(x)

Layer (type)                 Output Shape              Param #   
_________________________________________________________________
input_1 (InputLayer)         [(None, 1, 1710)]         0         
_________________________________________________________________
reshape (Reshape)            (None, 38, 5, 9)          0         
_________________________________________________________________
flatten (Flatten)            (None, 1710)              0         
_________________________________________________________________
dense (Dense)                (None, 512)               876032    
_________________________________________________________________
dense_1 (Dense)              (None, 368)               188784    

Upvotes: 1

Mark Snyder

Reputation: 1665

It's the Conv2D layer. The convolutional layer produces a 38x5 grid of length-9 vectors, and a Dense layer only operates on the last axis: it takes each of the 38x5 length-9 vectors as input and maps it to a length-512 vector, leaving the spatial dimensions intact.
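A minimal sketch of that behavior (the (38, 5, 9) input shape is taken from the question's model summary; everything else is illustrative): the same 9-to-512 weight matrix is applied at every spatial position, which is also why dense_1 has only 9*512 + 512 = 5120 parameters in the summary above.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Stand-in for the (None, 38, 5, 9) Conv2D output from the question.
inputs = keras.Input(shape=(38, 5, 9))
x = layers.Dense(512)(inputs)  # applied independently at each of the 38x5 positions
model = keras.Model(inputs, x)

print(model.output_shape)    # (None, 38, 5, 512), not (None, 512)
print(model.count_params())  # 9*512 + 512 = 5120
```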

To get rid of the spatial dependence, you'll want to use something like a pooling layer, possibly a GlobalMaxPool2D. This will consolidate the data into only the channel dimension, and produce a (None, 9) shaped output, which will lead to your expected shapes from the Dense layers.
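A minimal sketch of that fix (shapes again assumed from the question's summary; `GlobalMaxPooling2D` is the full layer name, `GlobalMaxPool2D` is its alias):

```python
from tensorflow import keras
from tensorflow.keras import layers

# Stand-in for the (None, 38, 5, 9) Conv2D output from the question.
inputs = keras.Input(shape=(38, 5, 9))
x = layers.GlobalMaxPooling2D()(inputs)  # (None, 38, 5, 9) -> (None, 9)
x = layers.Dense(512)(x)                 # (None, 512)
outputs = layers.Dense(368, activation='sigmoid', name='predictions')(x)
model = keras.Model(inputs, outputs)

print(model.output_shape)  # (None, 368)
```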

Upvotes: 0
