Multi Output Convolutional Neural Network

Question

I am developing a convolutional neural network to classify license plates. Each plate contains up to 8 characters, and each character is one of 37 possibilities (A-Z, 0-9, and a blank space). I am now wondering how to design the last two layers of my network. I think the last one has to be a softmax layer with 37 probabilities. Should it be fully connected to one(?) neuron in the layer before? I think the layer before needs 8 neurons, one for each of the 8 characters in the license plate, but I am not sure. Before these layers I add some convolutional and max-pooling layers. Is that a valid approach, or do you have other suggestions?

I wrote this code:

from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten, Conv2D, MaxPooling2D

model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3), input_shape = (600, 1200, 1), activation = "relu"))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(64, activation = "relu"))
model.add(Dense(8, activation = "relu"))
model.add(Dense(37, activation = "softmax"))

model.compile(optimizer="rmsprop", loss="categorical_crossentropy", metrics=["accuracy"])

I am especially unsure about the layers after my Flatten layer. Can anyone help? I hope I have described my problem properly.

Sreyas · Accepted Answer

For the earlier (convolutional) layers, there are many commonly used architectures you could try in order to get better accuracy on your dataset.

For the dense layers, there are several ways to tackle this. Since there are at most 8 characters with 37 possible values each, you could make the last layer model.add(Dense(37*8, activation = "sigmoid")) and threshold its outputs at 0.5 to represent all 37*8 possibilities.

 model = Sequential()
 model.add(Conv2D(32, kernel_size=(3, 3), input_shape = (600, 1200, 1), activation = "relu"))
 model.add(MaxPooling2D(pool_size=(2, 2)))
 model.add(Dropout(0.25))
 model.add(Flatten())
 model.add(Dense(64, activation = "relu"))
 model.add(Dense(37*8, activation = "sigmoid"))  # one sigmoid unit per (position, character) pair
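To make the 37*8 output concrete, here is a minimal sketch of how a plate string could be turned into a training target and how the network's scores could be read back. The character ordering in `ALPHABET`, the position-major layout of the flat vector, and the helper names `encode_plate`, `decode_threshold` and `decode_argmax` are all my own assumptions for illustration; nothing here comes from Keras itself.

```python
import string

# 37 classes per position: A-Z, 0-9 and a blank used as padding.
# This ordering is an assumption; any fixed ordering works.
ALPHABET = string.ascii_uppercase + string.digits + " "
MAX_LEN = 8  # at most 8 characters per plate


def encode_plate(plate):
    """Return a flat multi-hot list of length 37*8 for a plate string."""
    # Pad short plates with blanks and cut anything beyond 8 characters.
    padded = plate.upper().ljust(MAX_LEN)[:MAX_LEN]
    n = len(ALPHABET)
    target = [0.0] * (n * MAX_LEN)
    for pos, ch in enumerate(padded):
        # Raises ValueError if ch is not one of the 37 known characters.
        target[pos * n + ALPHABET.index(ch)] = 1.0
    return target


def decode_threshold(scores, threshold=0.5):
    """For each position, list every character whose score exceeds the threshold."""
    n = len(ALPHABET)
    return [
        [ALPHABET[k] for k in range(n) if scores[pos * n + k] > threshold]
        for pos in range(MAX_LEN)
    ]


def decode_argmax(scores):
    """Pick the single highest-scoring character per position, dropping trailing blanks."""
    n = len(ALPHABET)
    chars = []
    for pos in range(MAX_LEN):
        block = scores[pos * n:(pos + 1) * n]
        chars.append(ALPHABET[block.index(max(block))])
    return "".join(chars).rstrip()
```

One thing to keep in mind with plain thresholding: a position can end up with no character above 0.5, or with several, which is why the sketch also shows a per-position argmax as an alternative way to read the same scores.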

A more apt way could be to have 9 output layers: one with 8 neurons to denote the presence of a character at each position, and 8 others with 37 neurons each and a softmax activation to denote which character it is. Note that to do this you must use the Functional API instead of the Sequential API.

An example:

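 from keras.models import Model
 from keras.layers import Input, Dense, Dropout, Flatten, Conv2D, MaxPooling2D

 inp = Input(shape=(600,1200,1))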
 X = Conv2D(32, kernel_size=(3, 3), activation = "relu")(inp)
 X = MaxPooling2D(pool_size=(2, 2))(X)
 X = Dropout(0.25)(X)
 X = Flatten()(X)
 X = Dense(64, activation = "relu")(X)
 P = Dense(8, activation = "sigmoid")(X)  # presence is a per-position yes/no, so sigmoid rather than relu
 C1 = Dense(37, activation = "softmax")(X)
 ...
 C8 = Dense(37, activation = "softmax")(X)
 model = Model(inp, [P,C1,C2,...C8])
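
If you would rather not write out all eight character heads by hand, the same idea can be sketched with a loop. This is only one way to do it: the function name `build_model`, the layer names, the sigmoid on the presence head and the choice of losses are my assumptions, not something the question prescribes.

```python
from keras.models import Model
from keras.layers import Input, Dense, Dropout, Flatten, Conv2D, MaxPooling2D

NUM_POSITIONS = 8   # up to 8 characters per plate
NUM_CLASSES = 37    # A-Z, 0-9 and blank


def build_model(input_shape=(600, 1200, 1)):
    """Build a shared convolutional trunk with 1 presence head and 8 character heads."""
    inp = Input(shape=input_shape)
    x = Conv2D(32, kernel_size=(3, 3), activation="relu")(inp)
    x = MaxPooling2D(pool_size=(2, 2))(x)
    x = Dropout(0.25)(x)
    x = Flatten()(x)
    x = Dense(64, activation="relu")(x)

    # Presence head: one independent yes/no per position (sigmoid is an assumption).
    presence = Dense(NUM_POSITIONS, activation="sigmoid", name="presence")(x)

    # One softmax head per character position, all fed from the same features.
    chars = [
        Dense(NUM_CLASSES, activation="softmax", name=f"char_{i + 1}")(x)
        for i in range(NUM_POSITIONS)
    ]

    model = Model(inputs=inp, outputs=[presence] + chars)
    # One loss per output, in the same order as the outputs list.
    model.compile(
        optimizer="rmsprop",
        loss=["binary_crossentropy"] + ["categorical_crossentropy"] * NUM_POSITIONS,
    )
    return model
```

When training, `model.fit` then expects a list of 9 target arrays in the same order: one of shape (N, 8) for presence and eight of shape (N, 37) with one-hot characters. For a quick check it may help to call `build_model` with a much smaller `input_shape`, because at 600x1200 the first Dense layer after Flatten alone has several hundred million weights.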
