Reputation: 47
I have 530 data points belonging to 10 classes. I am not sure which numbers I should use for num_rows and num_columns. In this code I have num_rows = 40 and num_columns = 174:
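# Assumed imports (tf.keras)
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense
from tensorflow.keras.regularizers import l2

model = Sequential()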
model.add(Conv2D(filters=32, kernel_size=2, input_shape=(num_rows, num_columns, num_channels), activation='relu'))
model.add(MaxPooling2D(pool_size=2))
#model.add(Dropout(0.2))
model.add(Conv2D(filters=64, kernel_size=2, kernel_regularizer=l2(0.00001), bias_regularizer=l2(0.0001), activation='relu'))
model.add(MaxPooling2D(pool_size=2))
#model.add(Dropout(0.2))
model.add(Conv2D(filters=128, kernel_size=2, kernel_regularizer=l2(0.00001), bias_regularizer=l2(0.0001), activation='relu'))
model.add(MaxPooling2D(pool_size=2))
model.add(Dropout(0.2))
model.add(Conv2D(filters=128, kernel_size=2, kernel_regularizer=l2(0.00001), bias_regularizer=l2(0.0001), activation='relu'))
model.add(MaxPooling2D(pool_size=2))
model.add(Dropout(0.2))
#model.add(GlobalAveragePooling2D())
model.add(Flatten())
model.add(Dense(512, activation='relu'))
#model.add(Dropout(0.2))
model.add(Dense(256, activation='relu'))
#model.add(Dropout(0.2))
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(64, activation='relu'))
model.add(Dense(32, activation='relu'))
model.add(Dense(10, activation='softmax'))
# Compile the model
#opt = keras.optimizers.Adam(learning_rate=0.001)
model.compile(loss='categorical_crossentropy', metrics=['accuracy'], optimizer="Adam")
Upvotes: 2
Views: 63
Reputation: 11377
I am guessing your input is some sort of spectrogram (since you're working with audio but have a 3-dimensional input shape). Your input_shape has to reflect the size of the images you pass in. Simply check their height and width: these are your num_rows and num_columns.
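As a rough sketch of "check the size of what you pass in": the dimensions can be read off the feature array instead of being hard-coded. The array name `features` and its zero contents are hypothetical placeholders; the 530 x 40 x 174 size is taken from the question.

```python
import numpy as np

# Hypothetical stand-in for the real data: 530 spectrograms,
# each with 40 rows and 174 columns (sizes taken from the question).
features = np.zeros((530, 40, 174), dtype=np.float32)

# Read the dimensions off the data rather than hard-coding them.
num_samples, num_rows, num_columns = features.shape
num_channels = 1  # one intensity value per pixel

# Conv2D expects a 4-D batch: (samples, rows, columns, channels).
x = features.reshape(num_samples, num_rows, num_columns, num_channels)

# This tuple is what would be passed as input_shape to the first layer.
input_shape = x.shape[1:]
print(input_shape)
```

If the printed tuple does not match the input_shape given to the first Conv2D layer, Keras will raise a shape error when fitting.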
According to that code, the images have 3 colour bands. That makes sense for photos, but rarely for spectrograms. Those are usually false colours generated to make the plot look good; they add nothing for classification. A single channel is enough: the pixel intensity reflects the strength (amplitude) of the signal.
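If the spectrograms have already been saved as 3-band images, one way to get down to a single channel is to average the bands. This is only a sketch: the array `rgb` is a hypothetical random batch, and averaging a false-colour image does not exactly recover the original amplitudes, so keeping the raw 2-D spectrogram array is usually the cleaner option.

```python
import numpy as np

# Hypothetical batch of false-colour spectrogram images:
# (samples, rows, columns, 3 colour bands).
rgb = np.random.rand(4, 40, 174, 3).astype(np.float32)

# Average the colour bands into one intensity channel.
# keepdims=True keeps the trailing channel axis that Conv2D expects.
gray = rgb.mean(axis=-1, keepdims=True)

print(gray.shape)
```

The result has a trailing channel dimension of 1, matching input_shape=(num_rows, num_columns, 1).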
Three simple things you can do:
Use a single channel: input_shape=(num_rows, num_columns, 1). Colour only confuses the classifier.
kernel_size=2 makes little sense. Read up on convolutions and what kernels are first.
Upvotes: 3