Reputation: 363
I am running a regression problem on sensor data with four columns using an LSTM network. I have not yet used any regularization.
The code I used is given below:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import math
import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation
from keras.layers import LSTM
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error
from keras import callbacks
from keras.layers import Flatten
# load the dataset
gbx_data = pd.read_csv('/home/prm/Downloads/aggregated_vibration.csv', usecols=[4,5,6,7])
dataset = gbx_data.values
dataset = dataset.astype('float32')
scaler = MinMaxScaler(feature_range=(0, 1))
dataset = scaler.fit_transform(dataset)
train_size = int(len(dataset) * 0.63)
test_size = len(dataset) - train_size
train, test = dataset[0:train_size,:], dataset[train_size:len(dataset),:]
print(len(train), len(test))
def create_dataset(dataset, look_back):
    dataX, dataY = [], []
    for i in range(len(dataset)-look_back-1):
        a = dataset[i:(i+look_back), :]
        dataX.append(a)
        dataY.append(dataset[i + look_back, :])
    return np.array(dataX), np.array(dataY)
look_back = 10
trainX, trainY = create_dataset(train, look_back)
testX, testY = create_dataset(test, look_back)
trainX = trainX.reshape(trainX.shape[0], look_back, trainX.shape[2]) # model input shape & model output shape will be same always #
testX = testX.reshape(testX.shape[0], look_back, testX.shape[2])
batch_size = 120
class LossHistory(keras.callbacks.Callback):
    def on_train_begin(self, logs={}):
        self.losses = []
    def on_epoch_end(self, epoch, logs={}):
        self.losses.append(logs.get('loss'))
model=Sequential()
model.add(LSTM(10, return_sequences=True, input_shape=(look_back, 4), activation='relu'))
model.add(Dropout(0.2))
model.add(LSTM(12, return_sequences=True, input_shape=(look_back, 4), activation='relu'))
model.add(Dropout(0.2))
model.add(Flatten())
model.add(Dense(4, activation='relu'))
model.compile(loss='mean_squared_error', optimizer='adam', metrics=['accuracy'])
history = LossHistory()
model.fit(trainX, trainY, epochs=10, batch_size=batch_size, callbacks=[history])
print(history.losses)
I would like to know the following:

1. I collect the loss after each epoch through the LossHistory class. How can I get the weights after each epoch? I know model.get_weights() gives me all the weights, but how can I get them after each epoch?
2. model.get_config() gives me 'stateful': False. If I run a stateful LSTM, what change will actually occur, and by checking which values can I understand that change?
3. With return_sequences=False, what change will occur?

Running the above code, the loss history after 10 epochs is as follows:
[0.016399867401633194, 0.0029856997435597997, 0.0021351441705040426, 0.0016288172078515754, 0.0012535296516730061, 0.0010065438170736181, 0.00085688360991555948, 0.0007937529246583822, 0.00073356743746738303, 0.00069794598373472037]
with accuracy 77%.
I am also adding a table of several iterative approaches.
Sorry if I am asking a lot. Any assistance would be appreciated.
Upvotes: 0
Views: 950
Reputation: 86600
To do things after each epoch, you can use a Callback, in particular the LambdaCallback, which allows very flexible usage.
Define a lambda callback that will get weights after each epoch:
from keras.callbacks import LambdaCallback

getWeightsCallback = LambdaCallback(on_epoch_end=getWeightsFunction)
Where:
myWeights = []

def getWeightsFunction(epoch, logs):
    # adapt this code:
    myWeights.append(model.get_weights())
Then add the callback to your fit method:
model.fit(....., callbacks=[getWeightsCallback])
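Putting those pieces together for your script, a minimal end-to-end sketch (reusing the model, trainX, trainY and batch_size already defined in your code) could look like this:

from keras.callbacks import LambdaCallback

# collect a snapshot of all layer weights at the end of every epoch
epochWeights = []
getWeightsCallback = LambdaCallback(
    on_epoch_end=lambda epoch, logs: epochWeights.append(model.get_weights()))

model.fit(trainX, trainY, epochs=10, batch_size=batch_size,
          callbacks=[history, getWeightsCallback])

# epochWeights[i] is the list of weight arrays after epoch i
print(len(epochWeights), len(epochWeights[0]))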
Unfortunately, I can't answer that, if there is an answer at all. I do believe it's an intuitive thing, and it should be experimented with until you find what is best for your specific task and model.
What I know, though, is about the last layer. This one is completely related to the final task.
- A classification problem with only one true class among many benefits from using activation='softmax' and loss='categorical_crossentropy'.
- A classification problem with many true classes normally uses activation='sigmoid' and loss='binary_crossentropy'.
Other problems should have better options too, depending on the application.
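As a small illustration of those two setups (the input shape and layer sizes below are made up; only the last layer and the loss matter here):

from keras.models import Sequential
from keras.layers import Dense

# one true class among 10: softmax + categorical_crossentropy
single_label = Sequential()
single_label.add(Dense(32, activation='relu', input_shape=(20,)))
single_label.add(Dense(10, activation='softmax'))
single_label.compile(loss='categorical_crossentropy', optimizer='adam')

# several classes can be true at once: sigmoid + binary_crossentropy
multi_label = Sequential()
multi_label.add(Dense(32, activation='relu', input_shape=(20,)))
multi_label.add(Dense(10, activation='sigmoid'))
multi_label.compile(loss='binary_crossentropy', optimizer='adam')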
Recurrent networks have an "inner state", which is roughly the "memory" built from stepping through a sequence.
This state is unique for each sequence. Each sequence builds a state.
The idea of not resetting the states is to be able to divide each sequence in batches. If your sequences are too long (causing RAM or performance issues), you divide them in parts, and the model will understand that your batches are not "new sequences", but "sequels of the previous sequences".
The noticeable changes are the need to define extra parameters such as the batch size and to pass the data in the correct sequence order.
Keras documentation
An indirectly related question
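As a rough sketch of what a stateful version of your first layer would require (assuming the number of training samples is a multiple of batch_size; the shapes mirror your data with look_back time steps and 4 features):

from keras.models import Sequential
from keras.layers import LSTM, Dense

stateful_model = Sequential()
# stateful layers need a fixed batch size via batch_input_shape
stateful_model.add(LSTM(10, stateful=True,
                        batch_input_shape=(batch_size, look_back, 4)))
stateful_model.add(Dense(4))
stateful_model.compile(loss='mean_squared_error', optimizer='adam')

for epoch in range(10):
    # shuffle=False keeps the chunks of each long sequence in order
    stateful_model.fit(trainX, trainY, epochs=1, batch_size=batch_size,
                       shuffle=False)
    # reset the inner states once the whole sequence has been seen
    stateful_model.reset_states()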
Since recurrent networks work in time steps, every time step has a result.
You may choose to output all these results, ending up with a sequence (same number of time steps as the input), or you may choose to get only the final result, discarding the time steps.
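A quick way to see the difference is to compare the output shapes of two throwaway models (the layer size of 8 is arbitrary):

from keras.models import Sequential
from keras.layers import LSTM

look_back = 10  # same value as in your code

# return_sequences=True: one output vector per time step
seq_model = Sequential()
seq_model.add(LSTM(8, return_sequences=True, input_shape=(look_back, 4)))
print(seq_model.output_shape)   # (None, 10, 8) -> still a sequence

# return_sequences=False: only the last time step is kept
last_model = Sequential()
last_model.add(LSTM(8, return_sequences=False, input_shape=(look_back, 4)))
print(last_model.output_shape)  # (None, 8) -> time dimension discarded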
Sorry, that is definitely an open question. It's totally dependent on what you want to do, the size of your data, and the architecture of your model.
There is really no ready answer. Creating a perfect architecture for a certain application is exactly what everyone is seeking around the world.
You can run experiments or try to find papers that work with the same kind of problem you are working on, to see what the best practices are "so far".
Some related problems:
Upvotes: 1