Reputation: 31
Task: Given a random permutation of the numbers {0,1,2,3,4}, train a neural network to find the position (index) of the number "2". The network mimics a median filter, except that it outputs the index of the median value rather than the value itself. For example, given the input [3,1,0,2,4], the output/label is "3" (or [0,0,0,1,0] as a one-hot vector).
I can construct a simple neural network by hand, setting the kernel and bias weights manually, and it does the job perfectly. The Keras code is as follows:
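To make the task concrete, here is a minimal sketch of how the inputs and one-hot labels could be generated. The names `VAL_LEN`, `make_example` and `dataset` are illustrative assumptions and not part of my actual training code; the sketch uses only the standard library.

```python
import itertools

VAL_LEN = 5  # sequence length, same value as val_len in the model code


def make_example(perm):
    """Return (inputs, one_hot_label) for one ordering of 0..VAL_LEN-1."""
    target = perm.index(VAL_LEN // 2)  # position of the median value, 2
    label = [1 if i == target else 0 for i in range(VAL_LEN)]
    return list(perm), label


# All 5! = 120 distinct orderings of {0,1,2,3,4}.
dataset = [make_example(p) for p in itertools.permutations(range(VAL_LEN))]

x, y = make_example((3, 1, 0, 2, 4))
print(x, y)  # [3, 1, 0, 2, 4] [0, 0, 0, 1, 0]
```

For the Keras model, each input would additionally need a trailing channel axis so that it matches the (val_len, 1) input shape.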
import numpy as np
import keras
from keras import backend as K

val_len = 5

def GetModel():
    inputs_img = keras.layers.Input(shape=(val_len, 1), name='rank')
    # Pointwise (kernel size 1) convolution applied to each element independently
    net = keras.layers.Conv1D(1, 1, activation='tanh', name='layer1', trainable=True,
                              kernel_initializer=keras.initializers.RandomUniform(-5, 5),
                              bias_initializer=keras.initializers.RandomUniform(-5, 5))(inputs_img)
    net = keras.layers.Lambda(lambda x: K.abs(x))(net)
    net = keras.layers.Conv1D(1, 1, activation='relu', name='layer2', trainable=True,
                              kernel_initializer=keras.initializers.RandomUniform(-5, 5),
                              bias_initializer=keras.initializers.RandomUniform(-5, 5))(net)
    net = keras.layers.Flatten()(net)
    net = keras.layers.Dense(units=val_len, name='mid_pos', activation='softmax', trainable=True,
                             kernel_initializer=keras.initializers.RandomUniform(-5, 5),
                             bias_initializer=keras.initializers.RandomUniform(-5, 5))(net)
    model = keras.models.Model(inputs=[inputs_img], outputs=[net])
    return model
model = GetModel()
# layer1: kernel -1, bias 2, i.e. tanh(2 - x)
model.get_layer('layer1').set_weights([np.array([[[-1]]], dtype=np.float32), np.array([2], dtype=np.float32)])
# layer2: kernel -1, bias 1, i.e. relu(1 - x)
model.get_layer('layer2').set_weights([np.array([[[-1]]], dtype=np.float32), np.array([1], dtype=np.float32)])
# mid_pos: identity kernel and zero bias, i.e. a 1-to-1 connection
model.get_layer('mid_pos').set_weights([
    np.array([[1, 0, 0, 0, 0],
              [0, 1, 0, 0, 0],
              [0, 0, 1, 0, 0],
              [0, 0, 0, 1, 0],
              [0, 0, 0, 0, 1]], dtype=np.float32),
    np.array([0, 0, 0, 0, 0], dtype=np.float32)])
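As a sanity check on the hand-set weights, the same forward pass can be sketched in plain Python, without Keras. The helper names `forward` and `predict` are hypothetical, and this is a re-implementation of the arithmetic under the weights above, not the Keras model itself.

```python
import itertools
import math


def forward(seq):
    """Forward pass with the hand-set weights; returns softmax probabilities."""
    h1 = [abs(math.tanh(-1.0 * v + 2.0)) for v in seq]  # layer1 + abs: |tanh(2 - x)|
    h2 = [max(0.0, -1.0 * h + 1.0) for h in h1]          # layer2: relu(1 - h)
    # mid_pos has an identity kernel and zero bias, so the logits equal h2.
    exps = [math.exp(z) for z in h2]
    total = sum(exps)
    return [e / total for e in exps]


def predict(seq):
    """Index of the most probable position."""
    probs = forward(seq)
    return probs.index(max(probs))


correct = sum(predict(p) == p.index(2) for p in itertools.permutations(range(5)))
print(correct, "of 120 permutations classified correctly")  # 120 of 120 ...
```

Accuracy here is judged by argmax only. Because the logits all lie in [0, 1], the winning softmax probability is only about 0.37 with these weights.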
However, this simple model fails to learn these weights from examples (I tried many optimizers). The best accuracy it reaches is 0.2, which is no better than picking one of the 5 categories at random. The model does start to learn "layer1" if the weights of the "layer2" and "mid_pos" layers are assigned manually.
My questions are: 1. Why does this simple model fail to learn from examples? 2. How can its learning ability be improved? Thanks for your comments. (BTW, a generalized CNN is not a solution.)
Upvotes: 2
Views: 523
Reputation: 86630
About the model:
Upvotes: 2