Reputation: 619
I would like to build a toy LSTM model for regression. This nice tutorial is already too complicated for a beginner.
Given a sequence of length time_steps, predict the next value. Consider time_steps=3 and the sequences:
array([
[[ 1.],
[ 2.],
[ 3.]],
[[ 2.],
[ 3.],
[ 4.]],
...
the target values should be:
array([ 4., 5., ...
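(For concreteness, here is one way such windows and targets could be built with NumPy; the helper name make_windows is just illustrative, not part of any library.)

import numpy as np

def make_windows(series, time_steps):
    # Slide a window of length `time_steps` over the series; the value that
    # follows each window is its regression target.
    X, y = [], []
    for i in range(len(series) - time_steps):
        X.append(series[i:i + time_steps])
        y.append(series[i + time_steps])
    X = np.array(X, dtype=np.float32).reshape(-1, time_steps, 1)  # (batch_size, time_steps, n_features)
    y = np.array(y, dtype=np.float32).reshape(-1, 1)              # (batch_size, 1)
    return X, y

X_train, y_train = make_windows(np.arange(1., 11.), time_steps=3)
# X_train[0] -> [[1.], [2.], [3.]], y_train[0] -> [4.]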
I define the following model:
import tensorflow as tf

# Network Parameters
time_steps = 3
num_neurons = 64  # (arbitrary)
n_features = 1

# tf Graph input
x = tf.placeholder("float", [None, time_steps, n_features])
y = tf.placeholder("float", [None, 1])

# Define weights
weights = {
    'out': tf.Variable(tf.random_normal([num_neurons, 1]))
}
biases = {
    'out': tf.Variable(tf.random_normal([1]))
}
# LSTM model
def lstm_model(X, weights, biases, learning_rate=0.01, optimizer='Adagrad'):
    # Prepare data shape to match `rnn` function requirements
    # Current data input shape: (batch_size, time_steps, n_features)
    # Required shape: 'time_steps' tensors list of shape (batch_size, n_features)

    # Permuting batch_size and time_steps
    # input dimension: Tensor("Placeholder_:0", shape=(?, 3, 1), dtype=float32)
    X = tf.transpose(X, [1, 0, 2])
    # transposed dimension: Tensor("transpose_41:0", shape=(3, ?, 1), dtype=float32)

    # Reshaping to (time_steps*batch_size, n_features)
    X = tf.reshape(X, [-1, n_features])
    # reshaped dimension: Tensor("Reshape_:0", shape=(?, 1), dtype=float32)

    # Split to get a list of 'time_steps' tensors of shape (batch_size, n_features)
    X = tf.split(0, time_steps, X)
    # splitted dimension: [<tf.Tensor 'split_:0' shape=(?, 1) dtype=float32>, <tf.Tensor 'split_:1' shape=(?, 1) dtype=float32>, <tf.Tensor 'split_:2' shape=(?, 1) dtype=float32>]

    # LSTM cell
    cell = tf.nn.rnn_cell.LSTMCell(num_neurons)  # or GRUCell(num_neurons)
    output, state = tf.nn.dynamic_rnn(cell=cell, inputs=X, dtype=tf.float32)
    output = tf.transpose(output, [1, 0, 2])
    last = tf.gather(output, int(output.get_shape()[0]) - 1)
    return tf.matmul(last, weights['out']) + biases['out']
Instantiating the LSTM model with pred = lstm_model(x, weights, biases), I get the following error:
---> output, state = tf.nn.dynamic_rnn(cell=cell, inputs=X, dtype=tf.float32)
ValueError: Dimension must be 2 but is 3 for 'transpose_42' (op: 'Transpose') with input shapes: [?,1], [3]
1) Do you know what the problem is?
2) Will multiplying the LSTM output by the weights yield the regression?
Upvotes: 6
Views: 5478
Reputation: 126154
As discussed in the comments, the tf.nn.dynamic_rnn(cell, inputs, ...) function expects a list of three-dimensional tensors* as its inputs argument, where the dimensions are interpreted by default as batch_size x num_timesteps x num_features. (If you pass time_major=True, they are interpreted as num_timesteps x batch_size x num_features.) Therefore the preprocessing you've done on the original placeholder is unnecessary, and you can pass the original X value directly to tf.nn.dynamic_rnn().
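For example, a simplified version of your model (an untested sketch that reuses the x, weights, biases, and num_neurons defined in the question) could be:

def lstm_model(X, weights, biases):
    # X already has shape (batch_size, time_steps, n_features), which is what
    # tf.nn.dynamic_rnn() expects with the default time_major=False.
    cell = tf.nn.rnn_cell.LSTMCell(num_neurons)
    output, state = tf.nn.dynamic_rnn(cell=cell, inputs=X, dtype=tf.float32)

    # output has shape (batch_size, time_steps, num_neurons);
    # pick out the output of the last time step for the regression.
    output = tf.transpose(output, [1, 0, 2])
    last = tf.gather(output, int(output.get_shape()[0]) - 1)
    return tf.matmul(last, weights['out']) + biases['out']

pred = lstm_model(x, weights, biases)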
* Technically it can accept complicated nested structures in addition to lists, but the leaf elements must be three-dimensional tensors.**
** Investigating this turned up a bug in the implementation of tf.nn.dynamic_rnn(). In principle, it should be sufficient for the inputs to have at least two dimensions, but the time_major=False path assumes that they have exactly three dimensions when it transposes the input into the time-major form, and it was the error message that this bug inadvertently causes that showed up in your program. We're working on getting that fixed.
Upvotes: 8