Reputation: 587
I am trying to build a proof of concept for predicting how full a parking area is. I am using Keras to create an LSTM neural network that predicts how full an area will be at a given time.
This is the head of my dataframe:
Time_Stamp           Weekday  Area           Sub_Area                                          Free_Spots  Used_Spots  Full%
2014-04-10 08:00:00  Yes      Ballard Locks  NW 54TH SR ST BETWEEN 32ND AVE NW AND NW 54TH ST        68.0         1.0    1.0
2014-04-10 09:00:00  Yes      Ballard Locks  NW 54TH SR ST BETWEEN 32ND AVE NW AND NW 54TH ST        68.0         2.0    3.0
2014-04-10 10:00:00  Yes      Ballard Locks  NW 54TH SR ST BETWEEN 32ND AVE NW AND NW 54TH ST        12.0         0.0    0.0
2014-04-10 11:00:00  Yes      Ballard Locks  NW 54TH SR ST BETWEEN 32ND AVE NW AND NW 54TH ST        12.0         0.0    0.0
2014-04-10 12:00:00  Yes      Ballard Locks  NW 54TH SR ST BETWEEN 32ND AVE NW AND NW 54TH ST        12.0         0.0    0.0
I run the following code:
import numpy as np
from sklearn.model_selection import train_test_split

TRAIN, TEST, notused, notused = train_test_split(df['data']['Full%'],
                                                 df['data']['Full%'],
                                                 test_size=0.25)
TRAIN.sort_index(inplace=True)
TEST.sort_index(inplace=True)
# create train lists
x_train = []
y_train = []
# create test lists
x_test = []
y_test = []
# fill the train lists: each value is paired with the one that follows it
for i in range(len(TRAIN)-1):
    x_train.append(TRAIN[i])
    y_train.append(TRAIN[i+1])
# fill the test lists
for i in range(len(TEST)-1):
    x_test.append(TEST[i])
    y_test.append(TEST[i+1])
# change the lists to numpy arrays
x_train, y_train = np.array(x_train), np.array(y_train)
x_test, y_test = np.array(x_test), np.array(y_test)
The next part is where I can't get this to work:
from keras.models import Sequential
from keras.layers import LSTM, Dense

x_train = x_train.reshape(1, 56, 1)
y_train = y_train.reshape(1, 56, 1)
model = Sequential()
model.add(LSTM(56, input_dim=56, return_sequences=True))
model.add(Dense(56))
model.compile(loss='mean_absolute_error', optimizer='adam', metrics=['accuracy'])
model.fit(x_train, y_train, epochs=10000, batch_size=1, verbose=2, validation_data=(x_test, y_test))
I have been playing around with it, but the error is always some kind of ValueError:
ValueError: Error when checking input: expected lstm_24_input to have shape (None, None, 56) but got array with shape (1, 56, 1)
Now I have a couple of questions, besides what is wrong with my code:
It seems to be a problem that my train and test data have different sizes, because the input dimension will not be the same. How should I deal with this?
The datetime timestamp is not part of my train/test data, and because this is a real dataset (taken from https://github.com/bok11/IS-Data-Analasys/blob/master/Data/Annual_Parking_Study_Data.csv), the time between observations varies. Is this okay?
A full view of my notebook can be seen here: https://github.com/bok11/IS-Data-Analasys/blob/master/Data%20Exploration%20(Part%202).ipynb
EDIT: The goal of my task is to show whether it would be viable to collect this data in order to predict how full parking areas will be.
Upvotes: 0
Views: 463
Reputation: 86600
The message says that your input data (numpy arrays) has shape (1,56,1), while your model is expecting shape (any, any, 56).
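To make the mismatch concrete, here is a minimal pure-Python sketch of that comparison. `shape_matches` is a hypothetical helper written only for illustration, not part of Keras; it treats `None` as "any size", which is how the error message prints the dimensions that are left free.

```python
def shape_matches(actual, expected):
    """Return True if `actual` fits `expected`, where None means any size."""
    if len(actual) != len(expected):
        return False
    return all(e is None or a == e for a, e in zip(actual, expected))


expected = (None, None, 56)  # what the model in the question was built for
got = (1, 56, 1)             # what was actually passed to fit()

print(shape_matches(got, expected))         # last axis is 1, but the model wants 56
print(shape_matches((1, 1, 56), expected))  # 56 features in a single time step
print(shape_matches(got, (None, None, 1)))  # one feature per time step
```

Only the last axis conflicts here: the first two are free, so either the data or the model has to change on that axis.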
In recurrent networks, the input shape should be (batch size, time steps, input features).
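As a small NumPy illustration of that layout (the values are placeholders, not the parking data), the same 56 numbers can be arranged either as 56 time steps of one feature or as one time step of 56 features:

```python
import numpy as np

values = np.arange(56, dtype="float32")  # placeholder for 56 consecutive readings

# One sequence, 56 time steps, 1 feature per step
as_steps = values.reshape(1, 56, 1)

# One sequence, 1 time step, 56 features in that step
as_features = values.reshape(1, 1, 56)

print(as_steps.shape)
print(as_features.shape)
```

Both arrays hold identical numbers; only the axis that a recurrent layer would iterate over differs.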
So you need to decide whether you've got 56 time steps of the same feature, or only one time step with 56 different features. Then you pick which of the two shapes to adjust.
Since you're using LSTMs, it seems logical that you have sequences, so I assume you've got 56 time steps.
Then, your input shape in the LSTM layer should be:
LSTM(doesntMatter, input_shape=(56,1), return_sequences=True)
Or (if you want a variable number of steps):
LSTM(doesntMatter, input_shape=(None,1), return_sequences=True)
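With a variable number of steps, the train and test arrays no longer need to have the same length. A rough sketch of how the arrays could be prepared, assuming a one-step-ahead target as in the question (`to_lstm_arrays` is a made-up helper name, and the numbers are dummy values):

```python
import numpy as np


def to_lstm_arrays(series):
    """Turn a 1-D series into (x, y), each shaped (1, steps, 1), with y one step ahead of x."""
    series = np.asarray(series, dtype="float32")
    x = series[:-1].reshape(1, -1, 1)  # every value except the last
    y = series[1:].reshape(1, -1, 1)   # the value that follows each x
    return x, y


train_series = [1.0, 3.0, 0.0, 0.0, 0.0, 5.0, 8.0]  # dummy values
test_series = [2.0, 4.0, 6.0]                       # shorter on purpose

x_train, y_train = to_lstm_arrays(train_series)
x_test, y_test = to_lstm_arrays(test_series)

print(x_train.shape, y_train.shape)
print(x_test.shape, y_test.shape)
```

Because the targets here carry one value per time step, the last layer of the model would also have to output one value per step (for example `Dense(1)` after an LSTM with `return_sequences=True`) for the shapes to line up.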
Suppose you want more than one piece of information per time step, such as the date and the weekday. Then you've got two features, and your shape would be input_shape=(None,2).
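For instance, a sketch of stacking two per-step features into one input array. The pairing of Full% with a 0/1 weekday flag is only an assumption for illustration; how best to encode the timestamp is a separate modelling choice.

```python
import numpy as np

steps = 5
full_pct = np.array([1.0, 3.0, 0.0, 0.0, 0.0], dtype="float32")  # Full% readings
weekday = np.ones(steps, dtype="float32")  # assumed encoding: 1.0 = weekday, 0.0 = weekend

# Stack along a new last axis -> (steps, 2), then add the batch axis -> (1, steps, 2)
x = np.stack([full_pct, weekday], axis=-1)[np.newaxis, ...]

print(x.shape)
```

Each time step now carries a pair of values, so the last axis has size 2, matching the 2 in the input shape.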
Upvotes: 1