Reputation: 1181
I'm trying to predict the hot water consumption profile of a household using an LSTM with Python's Keras library. I watched some tutorials and took a Udemy course, but did not find one that helped much (recommendations appreciated). Since it's a one-time problem I don't really want to read a ton of books about this, which is why I was hoping I could count on some assistance from the experts on SO. The task:
The input is a ~1.5-year-long consumption profile with 1-minute resolution. I put this profile into a CSV file named "labels.csv". A second CSV file, "features.csv", contains, as the name suggests, the most important features: the minute of the day, the hour of the day, and the day of the week. The idea is that consumption usually occurs between 6am-8am and 6pm-8pm on weekdays, and a little later on weekends. Other influencing factors such as vacation days, month of the year, etc. were disregarded. The output should be the consumption profile of the next week, i.e. 10080 rows.
First, I import the relevant modules and load the CSV files.
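For illustration, the three time features described above can be derived from the row index alone. This is a minimal sketch, not the actual contents of features.csv: the helper name `build_time_features` is hypothetical, and it assumes the series starts at 00:00 on a known weekday, has no gaps, and that "minute of the day" runs from 0 to 1439.

```python
import numpy as np

MINUTES_PER_DAY = 24 * 60

def build_time_features(n_rows, start_weekday=0):
    """Return an (n_rows, 3) array: minute of day, hour of day, day of week.

    Assumes row 0 is 00:00 on `start_weekday` (0 = Monday) and that the
    1-minute series has no gaps; both are illustrative assumptions.
    """
    idx = np.arange(n_rows)
    minute_of_day = idx % MINUTES_PER_DAY
    hour_of_day = minute_of_day // 60
    day_of_week = (start_weekday + idx // MINUTES_PER_DAY) % 7
    return np.column_stack([minute_of_day, hour_of_day, day_of_week])

# One week at 1-minute resolution gives the 10080 rows to be predicted.
next_week = build_time_features(7 * MINUTES_PER_DAY)
print(next_week.shape)
```

The same function could produce the feature rows for the week to be predicted, since those depend only on the calendar and not on the consumption data.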
import pandas as pd
import plotter
from sklearn.model_selection import train_test_split
from keras.models import Sequential
from keras.layers import LSTM
from keras.layers import Dense
features = pd.read_csv('features.csv')
labels = pd.read_csv('labels.csv')
Then I divide the data into training and test sets:
x_train, x_test, y_train, y_test = train_test_split(features,labels,test_size=0.2)
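One caveat worth noting: `train_test_split` shuffles rows by default, which scrambles the time order that a sequence model relies on. Below is a minimal sketch of an order-preserving split, using toy arrays in place of the real CSV data; the helper name `chronological_split` is hypothetical.

```python
import numpy as np

def chronological_split(x, y, test_size=0.2):
    """Keep time order: the last `test_size` fraction becomes the test set."""
    n_test = int(round(len(x) * test_size))
    split = len(x) - n_test
    return x[:split], x[split:], y[:split], y[split:]

# Toy stand-ins for the features and labels (10 time steps, 3 features).
x = np.arange(30).reshape(10, 3)
y = np.arange(10)

x_tr, x_te, y_tr, y_te = chronological_split(x, y)
print(x_tr.shape, x_te.shape)
```

scikit-learn's `train_test_split` also accepts `shuffle=False`, which should give a comparable ordered split without a custom helper.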
Now I define my model.
model = Sequential()
Now I add the layers (I still do not know how to decide how many layers to use and how large they should be, but I can find that out by trial and error):
model.add(LSTM(24,activation='relu',input_shape=(1,3)))
model.add(Dense(1))
Compiling the model as such:
model.compile(loss='mse', optimizer="adam")
Finally, fitting the model:
model.fit(x_train,y_train,epochs=60,verbose=2)
The execution of the last line produces the error:
Traceback (most recent call last):
File "/home/bruno/Desktop/Python Projects/lstm_dhw_data2/lstm.py", line 24, in <module>
model.fit(x_train,y_train,epochs=60,verbose=2)
File "/usr/local/lib/python3.6/dist-packages/keras/engine/training.py", line 952, in fit
batch_size=batch_size)
File "/usr/local/lib/python3.6/dist-packages/keras/engine/training.py", line 751, in _standardize_user_data
exception_prefix='input')
File "/usr/local/lib/python3.6/dist-packages/keras/engine/training_utils.py", line 128, in standardize_input_data
'with shape ' + str(data_shape))
ValueError: Error when checking input: expected lstm_1_input to have 3 dimensions, but got array with shape (838860, 3)
So I don't even get to ...
results=model.predict(x_test)
print(results)
If anyone can point out what I did wrong, point me to a suitable (newbie-friendly) tutorial, or point me to a similar project that I can reuse, I would really appreciate it :)
I added the project to my GitHub
Edit: I also do get lots of deprecation warnings, even though
pip install --upgrade tensorflow
returns that everything is up to date ...
Upvotes: 0
Views: 1380
Reputation: 1599
The only thing you missed is that, for time series, the model needs a sequence as input, so your input should have the shape [batch_size, sequence_length, n_features]. Your array has shape (838860, 3), which is only [n_samples, n_features], hence the error. Right now your dataset is effectively one big sequence, so you need to split it into many shorter sequences before fitting the model. For example, with the TimeseriesGenerator from Keras (see the Keras documentation) you can create sequences of length 10 (or whatever length fits your data best) as follows:
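from keras.preprocessing.sequence import TimeseriesGenerator
sequence_length = 10
# x_train and y_train must be in chronological order (not shuffled).
# Pass NumPy arrays (.values), since the generator indexes rows by position.
data_gen = TimeseriesGenerator(x_train.values, y_train.values,
                               length=sequence_length,
                               batch_size=16)
model = Sequential()
# Each sample is now a window of sequence_length time steps with 3 features
model.add(LSTM(24,activation='relu',input_shape=(sequence_length, 3)))
model.add(Dense(1))
model.compile(loss='mse', optimizer="adam")
model.fit_generator(data_gen)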
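To make the required shape concrete, here is a rough NumPy-only sketch of the windowing that such a generator performs. It is an approximation for illustration, not the Keras implementation: the helper name `make_windows` and the toy data are assumptions, and each window is paired with the target of the step immediately after it.

```python
import numpy as np

def make_windows(features, targets, sequence_length):
    """Slice a 2-D series into overlapping windows.

    Window i holds rows i .. i + sequence_length - 1 and is paired with
    the target at row i + sequence_length (the step right after it).
    """
    features = np.asarray(features)
    targets = np.asarray(targets)
    n_windows = len(features) - sequence_length
    x = np.stack([features[i:i + sequence_length] for i in range(n_windows)])
    y = targets[sequence_length:]
    return x, y

# Toy data: 100 time steps with 3 features each, i.e. shape (100, 3).
features = np.arange(300, dtype=float).reshape(100, 3)
targets = np.arange(100, dtype=float)

x, y = make_windows(features, targets, sequence_length=10)
print(x.shape, y.shape)  # (90, 10, 3) (90,)
```

The resulting `x` has the three dimensions the LSTM layer expects. For a series with hundreds of thousands of rows, materialising every window at once multiplies memory use by the sequence length, which is one reason to prefer a batch-wise generator.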
Upvotes: 2