Reputation: 11422
My input data consists of 10 samples, each of which has 200 time steps, while each time step is described by a vector of 30 dimensions. In addition, each time step consists of a 3 dimensional vector (one hot encoding) which describes the action which has been taken at that particular time step. With that being said, I am trying to build a model which get fed in all previous actions and then predicts which action would be the best to take next.
I tried to get this working with tflearn and tensorflow but with limited success so far.
Simple sample code:
import numpy as np
import operator
import tflearn
from tflearn import regression
from tflearn.layers.core import input_data, dropout, fully_connected
from tflearn.layers.embedding_ops import embedding
from tflearn.layers.recurrent import bidirectional_rnn, BasicLSTMCell
from tflearn.data_utils import to_categorical, pad_sequences
SAMPLES = 10
TIME_STEPS = 200
DATA_DIMENSIONS = 30
LABEL_CLASSES = 3
x = []
y = []
# Generate fake data.
for i in range(SAMPLES):
sequences = []
outputs = []
for i in range(TIME_STEPS):
d = []
for i in range(DATA_DIMENSIONS):
d.append(1)
sequences.append(d)
outputs.append([0,0,1])
x.append(sequences)
y.append(outputs)
print("X1:", len(x), ", X2:", len(x[0]), ", X3:", len(x[0][0]))
print("Y1:", len(y), ", Y2:", len(y[0]), ", Y3:", len(y[0][0]))
# Define model
net = tflearn.input_data([None, TIME_STEPS, DATA_DIMENSIONS], name='input')
net = tflearn.lstm(net, 128, dropout=0.8, return_seq=True)
net = tflearn.fully_connected(net, LABEL_CLASSES, activation='softmax')
net = tflearn.regression(net, optimizer='adam', loss='categorical_crossentropy', name='targets')
model = tflearn.DNN(net)
# Fit model.
model.fit({'input': x}, {'targets': y},
n_epoch=1,
snapshot_step=1000,
show_metric=True, run_id='test', batch_size=32)
Error
ValueError: Cannot feed value of shape (10, 200, 3) for Tensor 'targets/Y:0', which has shape '(?, 3)'
As far as I understand, the input_data should be correct. However, the output data is apparently wrong, at least, Tensorflow throws an error. That is probably because my model expects one label per sample rather than one label per time step.
Can I even achieve my goal with an LSTM, and if so, how do I have to set up my model?
Thanks, Robert
Upvotes: 0
Views: 1470
Reputation: 3094
As the error suggests, there is a shape mismatch between the expected size of your targets tensor, and the one of the data you actually provide for it. Let us break it down.
From what I understand, you have labeled action for every timestep of your sequences. This means that the labels that you provide should have a shape (10, 200, 3)
. This seems to be the case from the error message. Good.
So we now know the error comes from what the network generates.
=================
Input data -> (10, 200, 30)
LSTM -> (10, 128)
(because return_seq=False
)
FullyConnected -> (10, 3)
.
=================
So that explains the second part of the error message, your network indeed produces an output with shape (10, 3)
which mismatches the one of your data.
I think you missed the return_seq
argument of the LSTM. As is usually the case with RNN implementations, you have a parameter telling if you want the layer to return outputs for the whole sequence, or only for the last timestep. Here by default it is the second option, that is why you don't get an output with the expected shape. Use return_seq=True
.
Upvotes: 1