amazonsx

Reputation: 71

TensorFlow: my RNN always outputs the same value; the weights of the RNN are not trained

I used TensorFlow to implement a simple RNN model to learn possible trends in time-series data and predict future values. However, the model always produces the same value after training. Effectively, the best model it converges to is:

y = b.

The RNN structure is:

InputLayer -> BasicRNNCell -> Dense -> OutputLayer

RNN code:

import tensorflow as tf

def RNN(n_timesteps, n_input, n_output, n_units):
    tf.reset_default_graph()
    # input: [batch, time, features]
    X = tf.placeholder(dtype=tf.float32, shape=[None, n_timesteps, n_input])
    cells = [tf.contrib.rnn.BasicRNNCell(num_units=n_units)]
    stacked_rnn = tf.contrib.rnn.MultiRNNCell(cells)
    # per-timestep RNN outputs: [batch, time, n_units]
    stacked_output, states = tf.nn.dynamic_rnn(stacked_rnn, X, dtype=tf.float32)
    # project each timestep's output to n_output values
    stacked_output = tf.layers.dense(stacked_output, n_output)
    return X, stacked_output

For training, n_timesteps=1, n_input=1, n_output=1, n_units=2, and learning_rate=0.0000001. The loss is the mean squared error.

The input is a sequence of values from consecutive days; the output is the value for the day following the input window.

(These may not be good settings, but no matter how I change them the results are almost the same, so I use them here to illustrate the problem below.)
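
For reference, the training step is roughly the usual MSE-plus-gradient-descent setup. A minimal sketch (the optimizer choice and the dummy data below are illustrative, not my exact code):

import numpy as np

X, pred = RNN(n_timesteps=1, n_input=1, n_output=1, n_units=2)
Y = tf.placeholder(dtype=tf.float32, shape=[None, 1, 1])
loss = tf.reduce_mean(tf.square(pred - Y))   # mean squared error
train_op = tf.train.GradientDescentOptimizer(learning_rate=1e-7).minimize(loss)

# dummy series standing in for my real data, shaped [n_batches, n_timesteps, n_input]
input_data = np.arange(100, dtype=np.float32).reshape(-1, 1, 1)
output_data = input_data + 1.0

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for epoch in range(10000):
        _, l = sess.run([train_op, loss], feed_dict={X: input_data, Y: output_data})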

I found out that this happens because the weights and bias of the BasicRNNCell are not being trained; they stay the same as at initialization, and only the weights and bias of the Dense layer keep changing. So during training I get predictions like these:

In the beginning:

[plot: predictions (blue) vs. true data (orange) at the start of training]

loss: 1433683500.0
rnn/multi_rnn_cell/cell_0/cell0/kernel:0  [KEEP UNCHANGED]
rnn/multi_rnn_cell/cell_0/cell0/bias:0  [KEEP UNCHANGED]
dense/kernel:0  [CHANGING]
dense/bias:0   [CHANGING]

After a while:

[plot: predictions (blue) vs. true data (orange) after some training]

loss: 175372340.0
rnn/multi_rnn_cell/cell_0/cell0/kernel:0 [KEEP UNCHANGED]
rnn/multi_rnn_cell/cell_0/cell0/bias:0 [KEEP UNCHANGED]
dense/kernel:0 [CHANGING]
dense/bias:0 [CHANGING]

The orange line shows the true data and the blue line shows the output of my code. During training, the blue line keeps moving up until the model reaches a stable loss.
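
(The [KEEP UNCHANGED] / [CHANGING] tags above come from printing the trainable variables during training, roughly like this:)

# print every trainable variable's current value inside the training session
for var in tf.trainable_variables():
    print(var.name, sess.run(var))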

To check whether my implementation was wrong, I generated a set of data with y = 10x + 5 for testing. This time, the model learns the correct result.
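
(The test data was generated as a simple linear ramp; the range and step below are illustrative:)

import numpy as np

# synthetic series following y = 10x + 5
x = np.arange(0, 100, dtype=np.float32)
data = 10 * x + 5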

In the beginning:

[plot: predictions vs. y = 10x + 5 at the start of training]

In the end:

[plot: predictions vs. y = 10x + 5 after training]

I have tried:

  1. adding more BasicRNNCell and Dense layers
  2. increasing the RNN cell hidden size (n_units) to 128
  3. decreasing learning_rate to 1e-10
  4. increasing n_timesteps to 60

None of them work.

So, my questions are:

  1. Is it because my model is too simple? I don't think the trend of my data is that complicated to learn; at least something like y = ax + b should produce a smaller loss than y = b.
  2. What might lead to these results?
  3. How should I continue debugging?
  4. Or could it be that BasicRNNCell is not fully implemented and users are expected to implement some of its functions themselves? I have no previous experience with TensorFlow.

Upvotes: 3

Views: 903

Answers (1)

rst

Reputation: 2724

It seems your net is just not suited to that kind of data, or, from another point of view, your data is badly scaled. When I add the four lines below after split_data, I get some sort of learning behavior, similar to the one in the a*x + b case:

import numpy as np

data = read_data(work_dir, input_file)
plot_data(data)
input_data, output_data, n_batches = split_data(data, n_timesteps, n_input, n_output)
# scale input and output data: shift to start at 0, then normalize
input_data = input_data - input_data[0]
input_data = input_data / np.max(input_data) * 1000
output_data = output_data - output_data[0]
output_data = output_data / np.max(output_data) * 1000

[plot: training result after rescaling the data]
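
(A likely reason this helps: BasicRNNCell uses a tanh activation by default, so very large unscaled inputs saturate it and the gradients flowing back into the RNN weights become vanishingly small, which matches the observation that only the Dense layer was updating.)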

Upvotes: 1
