kds

Reputation: 31

How to train a simple neural network to implement median filter?

Task: Given a random permutation of the numbers {0,1,2,3,4}, train a neural network to find the position index of the number "2". This network mimics a median filter, except that it returns the index of the median value rather than the median value itself. For example, given the input [3,1,0,2,4], the output/label is "3" (zero-based index), or [0,0,0,1,0] as a one-hot vector.
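To make the task concrete, here is a minimal sketch of how such training pairs could be generated. It uses only the standard library; the helper names `label_for` and `make_example` are illustrative assumptions and do not come from the original code.

```python
import random

VAL_LEN = 5  # sequence length, matching val_len in the question


def label_for(seq):
    """Return (index, one_hot) for the position of the median value 2."""
    idx = seq.index(2)
    one_hot = [1 if i == idx else 0 for i in range(len(seq))]
    return idx, one_hot


def make_example(rng):
    """Return one random permutation of 0..4 together with its label."""
    seq = list(range(VAL_LEN))
    rng.shuffle(seq)
    idx, one_hot = label_for(seq)
    return seq, idx, one_hot


if __name__ == "__main__":
    rng = random.Random(0)  # seeded only so that runs are repeatable
    for _ in range(3):
        print(make_example(rng))
```

For the example in the task description, `label_for([3, 1, 0, 2, 4])` gives index 3 and the one-hot vector [0, 0, 0, 1, 0].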

I can construct a simple neural network that does the job by setting the kernel and bias weights by hand, and it works perfectly. The Keras code is as follows:

import numpy as np
import keras
from keras import backend as K

val_len = 5

def GetModel():    
    inputs_img = keras.layers.Input(shape=(val_len, 1), name='rank')
    net = keras.layers.Conv1D(1, 1, activation='tanh', name='layer1', trainable=True, kernel_initializer=keras.initializers.RandomUniform(-5, 5), bias_initializer=keras.initializers.RandomUniform(-5, 5))(inputs_img)
    net = keras.layers.Lambda(lambda x: K.abs(x))(net)
    net = keras.layers.Conv1D(1, 1, activation='relu', name='layer2', trainable=True, kernel_initializer=keras.initializers.RandomUniform(-5, 5), bias_initializer=keras.initializers.RandomUniform(-5, 5))(net)
    net = keras.layers.Flatten()(net)
    net = keras.layers.Dense(units=val_len, name = 'mid_pos', activation='softmax', trainable=True, kernel_initializer=keras.initializers.RandomUniform(-5, 5), bias_initializer=keras.initializers.RandomUniform(-5, 5))(net)
    model = keras.models.Model(inputs=[inputs_img], outputs=[net])
    return model

model = GetModel()
# do 2 - x
model.get_layer('layer1').set_weights([np.array([[[-1]]], dtype=np.float32), np.array([2], dtype=np.float32)])
# do 1 - x
model.get_layer('layer2').set_weights([np.array([[[-1]]], dtype=np.float32), np.array([1], dtype=np.float32)])
# do 1-to-1 connection
model.get_layer('mid_pos').set_weights([np.array([[1,0,0,0,0],
                                              [0,1,0,0,0],
                                              [0,0,1,0,0],
                                              [0,0,0,1,0],
                                              [0,0,0,0,1]], dtype=np.float32),
                                    np.array([0,0,0,0,0], dtype=np.float32)])
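
As a sanity check on the hand-set weights above, the same forward pass can be traced in plain Python without Keras. This is only a sketch that mirrors the three layers under the weights shown (kernel -1 and bias 2, then kernel -1 and bias 1, then an identity Dense layer); the function names are illustrative assumptions.

```python
import itertools
import math


def forward(seq):
    """Mirror the hand-weighted model: tanh(2 - x), abs, relu(1 - h), softmax."""
    h1 = [abs(math.tanh(2 - x)) for x in seq]   # layer1 followed by the abs Lambda
    h2 = [max(0.0, 1 - h) for h in h1]          # layer2 with relu
    exps = [math.exp(v) for v in h2]            # identity Dense, then softmax
    total = sum(exps)
    return [e / total for e in exps]


def predict_index(seq):
    """Return the position with the highest softmax probability."""
    probs = forward(seq)
    return probs.index(max(probs))


if __name__ == "__main__":
    # Check every permutation of 0..4: the prediction should be the index of 2.
    ok = all(predict_index(list(p)) == p.index(2)
             for p in itertools.permutations(range(5)))
    print("all permutations correct:", ok)
```

Only the position holding 2 produces tanh(0) = 0, so it is the only position where the relu output reaches 1; every other position gets a smaller value and loses the softmax.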

However, this simple model can't learn the weights from examples (I tried many optimizers). The best accuracy it reaches is 0.2, which is no better than random guessing among 5 categories. The model does start to learn "layer1" if the weights of the "layer2" and "mid_pos" layers are assigned manually.

My questions are: 1. Why does this simple model fail to learn from examples? 2. How can its learning ability be improved? Thanks for your comments. (BTW, a generalized CNN is not a solution.)

Upvotes: 2

Views: 523

Answers (1)

Daniel Möller

Reputation: 86630

About the model:

  • Your initializers seem far too large. Use the default initializers instead.
  • Also, you're using relu on data with a very small dimension. The chance of it outputting all zeros is high.
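
The second point can be checked numerically. The following is a rough sketch, not a reproduction of Keras training: it draws the four scalar weights of "layer1" and "layer2" from U(-5, 5), as in the question, and counts how often the relu output is zero for every possible input value 0..4. Because both Conv1D layers have kernel size 1 and a single filter, such a draw makes the relu output zero for every training sample, so no gradient reaches those two layers. The function names and the trial count are assumptions chosen for illustration.

```python
import math
import random

INPUTS = [0, 1, 2, 3, 4]  # every sample is a permutation of these values


def layer2_outputs(w1, b1, w2, b2):
    """Per-value output of layer1 (tanh) -> abs -> layer2 (relu)."""
    hidden = [abs(math.tanh(w1 * x + b1)) for x in INPUTS]
    return [max(0.0, w2 * h + b2) for h in hidden]


def dead_fraction(limit, trials=2000, seed=0):
    """Fraction of U(-limit, limit) draws where relu is zero for all inputs."""
    rng = random.Random(seed)
    dead = 0
    for _ in range(trials):
        w1, b1, w2, b2 = (rng.uniform(-limit, limit) for _ in range(4))
        if all(v == 0.0 for v in layer2_outputs(w1, b1, w2, b2)):
            dead += 1
    return dead / trials


if __name__ == "__main__":
    print("fraction of dead initializations:", dead_fraction(5.0))
```

With a single relu unit there is no second unit to fall back on when this happens, which fits the observation that accuracy stays at chance level.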

Upvotes: 2
