amazonsx

Reputation: 71

TensorFlow: my RNN always outputs the same value; the weights of the RNN are not trained

I used TensorFlow to implement a simple RNN model to learn possible trends in time-series data and predict future values. However, the model always produces the same value after training. Effectively, the best model it converges to is:

y = b.

The RNN structure is:

InputLayer -> BasicRNNCell -> Dense -> OutputLayer

RNN code:

import tensorflow as tf

def RNN(n_timesteps, n_input, n_output, n_units):
    tf.reset_default_graph()
    # input: [batch, time, features]
    X = tf.placeholder(dtype=tf.float32, shape=[None, n_timesteps, n_input])
    cells = [tf.contrib.rnn.BasicRNNCell(num_units=n_units)]
    stacked_rnn = tf.contrib.rnn.MultiRNNCell(cells)
    # per-timestep RNN outputs: [batch, time, n_units]
    stacked_output, states = tf.nn.dynamic_rnn(stacked_rnn, X, dtype=tf.float32)
    # project each timestep's output to n_output values
    stacked_output = tf.layers.dense(stacked_output, n_output)
    return X, stacked_output

For training, n_timesteps=1, n_input=1, n_output=1, n_units=2, and learning_rate=0.0000001. The loss is the mean squared error.

The input is a sequence of values from consecutive days; the output is the value for the day following the input window.

(These may not be good settings, but no matter how I change them the results are almost the same, so I use them here to illustrate the problem below.)
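
For reference, the training step is roughly the usual MSE-plus-gradient-descent setup. A minimal sketch (the optimizer choice and the dummy data below are illustrative, not my exact code):

import numpy as np

X, pred = RNN(n_timesteps=1, n_input=1, n_output=1, n_units=2)
Y = tf.placeholder(dtype=tf.float32, shape=[None, 1, 1])
loss = tf.reduce_mean(tf.square(pred - Y))   # mean squared error
train_op = tf.train.GradientDescentOptimizer(learning_rate=1e-7).minimize(loss)

# dummy series standing in for my real data, shaped [n_batches, n_timesteps, n_input]
input_data = np.arange(100, dtype=np.float32).reshape(-1, 1, 1)
output_data = input_data + 1.0

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for epoch in range(10000):
        _, l = sess.run([train_op, loss], feed_dict={X: input_data, Y: output_data})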

I found out that this happens because the weights and bias of the BasicRNNCell are not being trained; they stay the same as at initialization, and only the weights and bias of the Dense layer keep changing. So during training I get predictions like these:

In the beginning:

[plot: predictions (blue) vs. true data (orange) at the start of training]

loss: 1433683500.0
rnn/multi_rnn_cell/cell_0/cell0/kernel:0  [KEEP UNCHANGED]
rnn/multi_rnn_cell/cell_0/cell0/bias:0  [KEEP UNCHANGED]
dense/kernel:0  [CHANGING]
dense/bias:0   [CHANGING]

After a while:

[plot: predictions (blue) vs. true data (orange) after some training]

loss: 175372340.0
rnn/multi_rnn_cell/cell_0/cell0/kernel:0 [KEEP UNCHANGED]
rnn/multi_rnn_cell/cell_0/cell0/bias:0 [KEEP UNCHANGED]
dense/kernel:0 [CHANGING]
dense/bias:0 [CHANGING]

The orange line shows the true data and the blue line shows the output of my code. During training, the blue line keeps moving up until the model reaches a stable loss.
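
(The [KEEP UNCHANGED] / [CHANGING] tags above come from printing the trainable variables during training, roughly like this:)

# print every trainable variable's current value inside the training session
for var in tf.trainable_variables():
    print(var.name, sess.run(var))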

To check whether my implementation was wrong, I generated a set of data with y = 10x + 5 for testing. This time, the model learns the correct result.
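
(The test data was generated as a simple linear ramp; the range and step below are illustrative:)

import numpy as np

# synthetic series following y = 10x + 5
x = np.arange(0, 100, dtype=np.float32)
data = 10 * x + 5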

In the beginning:

[plot: predictions vs. y = 10x + 5 at the start of training]

In the end:

[plot: predictions vs. y = 10x + 5 after training]

I have tried:

  1. adding more BasicRNNCell and Dense layers
  2. increasing the RNN cell hidden size (n_units) to 128
  3. decreasing learning_rate to 1e-10
  4. increasing n_timesteps to 60

None of them work.

So, my questions are:

  1. Is it because my model is too simple? I don't think the trend of my data is that complicated to learn; at least something like y = ax + b should produce a smaller loss than y = b.
  2. What might lead to these results?
  3. How should I continue debugging?
  4. Or could it be that BasicRNNCell is not fully implemented and users are expected to implement some of its functions themselves? I have no previous experience with TensorFlow.

Upvotes: 3

Views: 903

Answers (1)

rst

Reputation: 2724

It seems your net is just not suited to that kind of data, or, from another point of view, your data is badly scaled. When I add the four lines below after split_data, I get some sort of learning behavior, similar to the one in the a*x + b case:

import numpy as np

data = read_data(work_dir, input_file)
plot_data(data)
input_data, output_data, n_batches = split_data(data, n_timesteps, n_input, n_output)
# scale input and output data: shift to start at 0, then normalize
input_data = input_data - input_data[0]
input_data = input_data / np.max(input_data) * 1000
output_data = output_data - output_data[0]
output_data = output_data / np.max(output_data) * 1000

[plot: training result after rescaling the data]
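
(A likely reason this helps: BasicRNNCell uses a tanh activation by default, so very large unscaled inputs saturate it and the gradients flowing back into the RNN weights become vanishingly small, which matches the observation that only the Dense layer was updating.)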

Upvotes: 1
