I am trying to wrap my head around the input shape needed for my specific task. I am attempting to train a Q-learner on some time series data contained in a DataFrame. My DataFrame has the following columns: open, close, high, low, and I am feeding the model a sliding window of, say, 50 timesteps. Here is example code for a single window:
import numpy as np

window = df.iloc[0:50]  # one 50-timestep window
df_norm = (window - window.mean()) / (window.max() - window.min())  # normalize each column
x = df_norm.values
x = np.expand_dims(x, axis=0)  # add a batch dimension
print(x.shape)
# (1, 50, 4)
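For completeness, this is roughly how I build the full training array from all windows (make_windows is just a helper name I sketched for this post, not part of my actual code):

import numpy as np

def make_windows(df, window_size=50):
    # Slide a window over the DataFrame, normalize each one, and stack them.
    windows = []
    for start in range(len(df) - window_size + 1):
        w = df.iloc[start:start + window_size]
        w_norm = (w - w.mean()) / (w.max() - w.min())
        windows.append(w_norm.values)
    return np.stack(windows)  # shape: (num_windows, window_size, 4)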
Now that I know my shape is (1, 50, 4) for each item in X, I'm at a loss as to what input shape to give my model. Let's say I have the following:
model = Sequential()
model.add(LSTM(32, return_sequences=True, input_shape=(50,4)))
model.add(LSTM(32, return_sequences=True))
model.add(Dense(num_actions))
This gives the following error:
ValueError: could not broadcast input array from shape (50,4) into shape (1,50)
And here is another attempt:
model = Sequential()
model.add(Dense(hidden_size, input_shape=(50,4), activation='relu'))
model.add(Dense(hidden_size, activation='relu'))
model.add(Dense(num_actions))
model.compile(sgd(lr=.2), "mse")
which gives the following error:
ValueError: could not broadcast input array from shape (50,4) into shape (1,50)
Here is the shape the model is expecting versus the actual shape of the state from my env:

print("Inputs: {}".format(model.input_shape))
print("actual: {}".format(env.state.shape))
# Inputs: (None, 50, 4)
# actual: (1, 50, 4)
Can someone explain where I am going wrong with the shapes here?
Upvotes: 4
Views: 936
The recurrent layer takes inputs of shape (batch_size, timesteps, input_features). Since the shape of x is (1, 50, 4), the data should be interpreted as a single batch of 50 timesteps, each containing 4 features. When initializing the first layer of a model, you pass an input_shape: a tuple specifying the shape of the input, excluding the batch_size dimension. In the case of LSTM layers, you can pass None as the timesteps dimension. Hence, this is how the first layer of the network should be initialized:
model.add(LSTM(32, return_sequences=True, input_shape=(None, 4)))
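A quick way to verify the resulting shapes (a minimal sketch; the imports follow the same Keras version used in the question):

from keras.models import Sequential
from keras.layers import LSTM

model = Sequential()
model.add(LSTM(32, return_sequences=True, input_shape=(None, 4)))
print(model.input_shape)   # (None, None, 4): batch size and timesteps are both free
print(model.output_shape)  # (None, None, 32): full sequences are returned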
The second LSTM layer is followed by a Dense layer that should map each window to a single action vector, so this LSTM should not return sequences: with return_sequences left at its default of False, it outputs only its final state rather than the whole sequence. Hence, this is how you should initialize the second LSTM layer:
model.add(LSTM(32))
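Continuing the shape check above, adding this layer removes the timesteps dimension:

model.add(LSTM(32))        # same call as above, appended to the sketch model
print(model.output_shape)  # (None, 32): one 32-dimensional vector per batch item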
Every batch of 50 timesteps in x is supposed to be mapped to a single action vector in y. Therefore, since the shape of x is (1, 50, 4), the shape of y must be (1, num_actions). Make sure y doesn't have a timesteps dimension.
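For example, a placeholder target with the right shape (the values are purely illustrative, and num_actions is whatever your environment defines):

import numpy as np

num_actions = 3                     # hypothetical action count
y = np.random.rand(1, num_actions)  # one action vector per window, no timesteps axis
print(y.shape)
# (1, 3)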
Therefore, under the assumption that x and y have the right shapes, the following code should work:
model = Sequential()
model.add(LSTM(32, return_sequences=True, input_shape=(None, 4)))
model.add(LSTM(32))
model.add(Dense(num_actions))
model.compile(sgd(lr=.2), "mse")
# x.shape == (1, 50, 4)
# y.shape == (1, num_actions)
history = model.fit(x, y)
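The same shape rules apply at inference time, where the network's output plays the role of the Q-values; a quick sanity check, assuming x is built as in the question:

q_values = model.predict(x)    # shape (1, num_actions): one Q-value per action
action = q_values[0].argmax()  # greedy action for this window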
Upvotes: 2