MLP for speech recognition

Question

I am trying to learn speech recognition, so I am starting with a simple MLP.

Below is the code:

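# Simple MLP model

# Imports assumed (tf.keras); they were not shown in the original post
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Activation, Dropout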

num_labels = Y.shape[1]
filter_size = 2

# Construct model 
model = Sequential()

model.add(Dense(256, input_shape=(32,)))
model.add(Activation('relu'))
model.add(Dropout(0.5))

model.add(Dense(256))
model.add(Activation('relu'))
model.add(Dropout(0.5))

model.add(Dense(num_labels))
model.add(Activation('softmax'))

# Compile the model
model.compile(loss='categorical_crossentropy', metrics=['accuracy'], optimizer='adam')

# Display model architecture summary 
model.summary()

# Calculate pre-training accuracy 

score = model.evaluate(X_test, Y_test, verbose=2)
accuracy = 100*score[1]

print("Pre-training accuracy: %.4f%%" % accuracy)

I am using MFCC for feature extraction and MLB for one-hot encoding.

The shapes of X_train, X_val, X_test, Y_train, Y_val and Y_test are, respectively: (54296, 99, 32), (6787, 99, 32), (6788, 99, 32), (54296, 31), (6787, 31), (6788, 31)

I am getting the following errors:

WARNING:tensorflow:Model was constructed with shape (None, 32) for input Tensor("dense_21_input:0", shape=(None, 32), dtype=float32), but it was called on an input with incompatible shape (None, 99, 32).

When I change the input_shape to (99, 32) the warning disappears. Can anybody explain the reason?

ValueError: Shapes (None, 31) and (None, 99, 31) are incompatible (this one occurs when I try to calculate the pre-training accuracy)

I have no idea how to deal with this error.

I look forward to receiving some help.

Thanks!
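The two shapes in the ValueError can be reproduced without Keras. A Dense layer multiplies its input by a weight matrix along the last axis only, so a 3-D input of shape (batch, 99, 32) keeps its 99-frame axis all the way to the output. Declaring input_shape=(99, 32) silences the warning because the declared shape then matches the data, but the output is still (batch, 99, 31), which cannot be compared with labels of shape (batch, 31). The sketch below is only an illustration under assumptions: `dense` is a hypothetical NumPy stand-in with random weights, not the Keras layer itself, and the batch of 4 clips is made-up data.

```python
import numpy as np

rng = np.random.default_rng(0)

def dense(x, units):
    """Hypothetical stand-in for a Dense layer: acts on the last axis only."""
    kernel = rng.standard_normal((x.shape[-1], units))
    bias = np.zeros(units)
    return x @ kernel + bias

# Made-up batch: 4 clips, 99 frames each, 32 MFCC features per frame
batch = rng.standard_normal((4, 99, 32))

# Applied directly to the 3-D input, the frame axis is carried through
per_frame = dense(batch, 31)
print(per_frame.shape)  # (4, 99, 31)

# Flattening each clip into one vector first gives one output row per clip
flat = batch.reshape(len(batch), -1)
per_clip = dense(flat, 31)
print(per_clip.shape)  # (4, 31)
```

In the Keras model, the equivalent of the reshape would be a Flatten layer with input_shape=(99, 32) placed before the first Dense layer. That is one possible way to make the output shape match the (None, 31) labels, not necessarily the best choice for speech data.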
