Reputation: 13
I am running into issues preparing my data for use in Keras's LSTM layer. The data is a 1,600,000-row time-series CSV consisting of a date and three features:
Date F1 F2 F3
2016-03-01 .252 .316 .690
2016-03-02 .276 .305 .691
2016-03-03 .284 .278 .687
...
My goal is to predict the value of F1 prediction_period timesteps in the future. Understanding that Keras's LSTM layer takes input data in the format (samples, timesteps, dimensions), I wrote the following function to convert my data into a 3D numpy array in that format (using 2016-03-03 as an example):
[[[.284, .278, .687], [.276, .305, .691], [.252, .316, .690]],...other samples...]
This function creates the array by stacking copies of the data, with each copy shifted one step further back in time. Here, lookback is the number of "layers" in the stack and trainpercent is the train/test split:
import numpy as np
import pandas as pd

# prediction_period, lookback and trainpercent are module-level parameters

def loaddata(path):
    df = pd.read_csv(path)
    df.drop(['Date'], axis=1, inplace=True)
    # the label is F1 shifted prediction_period steps into the future
    df['label'] = df.F1.shift(periods=-prediction_period)
    df.dropna(inplace=True)
    # train/test split
    df_train, df_test = df.iloc[:int(trainpercent * len(df))], df.iloc[int(trainpercent * len(df)):]
    train_X, train_Y = df_train.drop('label', axis=1).copy(), df_train[['label']].copy()
    test_X, test_Y = df_test.drop('label', axis=1).copy(), df_test[['label']].copy()
    train_X, train_Y, test_X, test_Y = train_X.values, train_Y.values, test_X.values, test_Y.values
    train_X, train_Y, test_X, test_Y = train_X.astype('float32'), train_Y.astype('float32'), test_X.astype('float32'), test_Y.astype('float32')
    # stack lookback shifted copies, then trim the rows that np.roll wrapped around
    train_X, test_X = stackit(train_X), stackit(test_X)
    train_X, test_X = train_X[:, lookback:, :], test_X[:, lookback:, :]
    train_Y, test_Y = train_Y[lookback:, :], test_Y[lookback:, :]
    # reshape to (samples, timesteps, dimensions)
    train_X = np.reshape(train_X, (train_X.shape[1], train_X.shape[0], train_X.shape[2]))
    test_X = np.reshape(test_X, (test_X.shape[1], test_X.shape[0], test_X.shape[2]))
    train_Y, test_Y = np.reshape(train_Y, (train_Y.shape[0])), np.reshape(test_Y, (test_Y.shape[0]))
    return train_X, train_Y, test_X, test_Y

def stackit(thearray):
    # stack lookback copies of the array, each rolled one step further back in time
    thelist = []
    for i in range(lookback):
        thelist.append(np.roll(thearray, shift=i, axis=0))
    thestack = np.stack(thelist)
    return thestack
While the network accepted the data and did train, the loss values were exceptionally high, which was very surprising considering that the data has a definite periodic trend. To try and isolate the problem, I replaced my dataset and network structure with a sin-wave dataset and structure from this example: http://www.jakob-aungiers.com/articles/a/LSTM-Neural-Network-for-Time-Series-Prediction.
Even with the sin-wave dataset, the loss was still orders of magnitude higher than that produced by the example function. I went through the function piece by piece, using a one-column sequential dataset, and compared expected values with the actual values. I didn't find any errors.
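For example, on a toy one-column sequence (with lookback = 3 chosen just for illustration), stackit produces the shifted stack I expect:

import numpy as np

lookback = 3
toy = np.arange(6, dtype='float32').reshape(6, 1)  # a one-column sequence: 0..5
stacked = stackit(toy)
print(stacked.shape)             # (3, 6, 1): one copy per lookback step
print(stacked[:, lookback:, 0])  # [[3. 4. 5.] [2. 3. 4.] [1. 2. 3.]]: each copy shifted back one step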
Am I structuring my input data incorrectly for Keras's LSTM layer? If so, what is the proper way to do this? If not, what would you expect to cause these symptoms (extremely high loss that does not decrease over time, even after 40+ epochs) in my function or otherwise?
Thanks in advance for any advice you can provide!
Upvotes: 1
Views: 1108
Reputation: 75
Here are some things you can do to improve your predictions:
First, make sure your input data is centered, i.e. apply some standardization or normalization. You can either use MinMaxScaler or StandardScaler from the sklearn library or implement some custom scaling based on your data.
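For example, a minimal sketch with MinMaxScaler, applied to the 2-D feature arrays before they are reshaped to 3-D (fit the scaler on the training split only, then reuse it for the test split):

from sklearn.preprocessing import MinMaxScaler

scaler = MinMaxScaler(feature_range=(0, 1))
train_X = scaler.fit_transform(train_X)  # learn min/max from the training data only
test_X = scaler.transform(test_X)        # apply the same scaling to the test data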
Make sure your network (LSTM/GRU/RNN) is big enough to capture the complexity in your data.
Use the TensorBoard callback in Keras to monitor your weight matrices and losses.
Use an adaptive optimizer instead of setting custom learning parameters, e.g. 'adam' or 'adagrad'.
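Putting those three points together, here is a minimal sketch (the layer sizes, batch size and epoch count are illustrative, not tuned; train_X is assumed to have the (samples, lookback, 3) shape from your loaddata function):

from keras.models import Sequential
from keras.layers import LSTM, Dense
from keras.callbacks import TensorBoard

model = Sequential()
model.add(LSTM(64, return_sequences=True, input_shape=(lookback, 3)))  # (timesteps, features)
model.add(LSTM(32))
model.add(Dense(1))  # single regression output

# adaptive optimizer: no hand-tuned learning-rate schedule needed to get started
model.compile(loss='mse', optimizer='adam')

# log losses and weight histograms so TensorBoard can display them
tensorboard = TensorBoard(log_dir='./logs', histogram_freq=1)
model.fit(train_X, train_Y, epochs=40, batch_size=128,
          validation_data=(test_X, test_Y), callbacks=[tensorboard])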
Using these will at least make sure that your network is training. You should see a gradual decrease in loss over time. Once you have solved this problem, you are free to experiment with your initial hyper-parameters and to implement different regularization techniques.
Good luck!
Upvotes: 1
Reputation: 11553
A "high loss" is a very subjective thing. We can not assess this without seeing your model.
It can come from multiple reasons:
You see that there are plenty of possibilities. A high loss doesn't mean anything in itself. You can have a really small loss and just do + 1000 and your loss will be high eventhough the problem is solved
Upvotes: 0