The Diagram explanation of the LSTM Network

Question

I am working with LSTM for my time series forecasting problem. I have the following network:

model = Sequential()
model.add(LSTM(units=300, activation=activation, input_shape=(20, 1)))
model.add(Dense(20))

My forecasting problem is to predict the next 20 time steps by looking back at the last 20 time steps. So, for each iteration, I have an input sequence like (x_t-19...x_t) and forecast the next (x_t+1...x_t+20). For the hidden layer, I use 300 hidden units.

As an LSTM is different from a simple feed-forward neural network, I cannot understand how those 300 hidden units are used for the LSTM cells and how the output comes out. Are there 20 LSTM cells and 300 units for each cell? How is the output generated from these cells? As I described above, I have 20 time steps to predict; are all of these steps generated from the last LSTM cell? I have no idea. Can someone give a diagram example of this kind of network structure?

Manoj Mohan · Accepted Answer

Regarding these questions,

"I cannot understand how those 300 hidden units are used for the LSTM cells and how the output comes out. Are there 20 LSTM cells and 300 units for each cell? How is the output generated from these cells?"

It is simpler to consider the LSTM layer you have defined as a single block. This diagram is heavily borrowed from Francois Chollet's Deep Learning with Python book:

In your model, input shape is defined as (20,1), so you have 20 time-steps of size 1. For a moment, consider that the output Dense layer is not present.

from keras.models import Sequential
from keras.layers import LSTM

model = Sequential()
model.add(LSTM(300, input_shape=(20,1)))
model.summary()

Layer (type)      Output Shape      Param #
lstm_7 (LSTM)     (None, 300)       362400

The output shape of the LSTM layer is (None, 300), which means that for each sample in the batch the layer outputs a single vector of size 300.
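The 362400 in the summary row can be reproduced by hand. The following is a minimal sketch, assuming a standard LSTM layer with bias terms and four gate-like blocks (input, forget, cell candidate, output), each with its own input weights, recurrent weights and bias. The helper name lstm_param_count is hypothetical and only used for illustration.

```python
def lstm_param_count(units, input_dim):
    """Trainable parameters of a standard LSTM layer (assumes bias is enabled)."""
    # Each of the 4 blocks has: input weights (input_dim x units),
    # recurrent weights (units x units) and a bias vector (units).
    per_block = units * (input_dim + units) + units
    return 4 * per_block

# 300 units, 1 input feature per time step, as in the model above.
print(lstm_param_count(300, 1))  # 362400
```

Note that the number of time steps (20) does not appear in the formula: the same weights are reused at every step.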

import numpy as np

output = model.predict(np.zeros((1, 20, 1)))
print(output.shape)

(1, 300)

input (1,20,1) => batch size = 1, time-steps = 20, input-feature-size = 1.

output (1, 300) => batch size = 1, output-feature-size = 300

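Keras ran the same LSTM cell recurrently over the 20 time-steps and returned only the output of the final step, a vector of size 300 (this is the default behaviour, return_sequences=False). So there are not 20 separate cells with 300 units each: there is one cell whose hidden state has size 300, applied 20 times with shared weights. In the diagram above, the returned vector is Output t+19.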
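To make the recurrence concrete, here is a rough NumPy sketch of what an LSTM layer does with one input sequence. It is an illustration under stated assumptions, not Keras's actual implementation: the weights are random rather than trained, the function name lstm_last_output is hypothetical, and the gate ordering (input, forget, candidate, output) is simply a convention chosen for this sketch.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_last_output(x, W, U, b, units):
    """Run one LSTM cell over a sequence and return the final hidden state.

    x: (timesteps, input_dim), W: (input_dim, 4*units),
    U: (units, 4*units), b: (4*units,)
    """
    h = np.zeros(units)  # hidden state, also the per-step output
    c = np.zeros(units)  # cell state
    for x_t in x:        # the SAME weights are reused at every time step
        z = x_t @ W + h @ U + b
        i = sigmoid(z[:units])               # input gate
        f = sigmoid(z[units:2 * units])      # forget gate
        g = np.tanh(z[2 * units:3 * units])  # candidate cell values
        o = sigmoid(z[3 * units:])           # output gate
        c = f * c + i * g
        h = o * np.tanh(c)
    return h             # only the last step's output is returned

rng = np.random.default_rng(0)
units, timesteps, input_dim = 300, 20, 1
W = rng.normal(scale=0.1, size=(input_dim, 4 * units))
U = rng.normal(scale=0.1, size=(units, 4 * units))
b = np.zeros(4 * units)

x = rng.normal(size=(timesteps, input_dim))  # one sequence of 20 steps
h_last = lstm_last_output(x, W, U, b, units)
print(h_last.shape)  # (300,)
```

The loop runs 20 times, but W, U and b are created once, which is why the layer's parameter count does not depend on the sequence length.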

Now, if you add the Dense(20) layer after the LSTM, it maps this 300-dimensional vector to an output of size 20, so all 20 forecast values are produced together from the final LSTM output.
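As a small sketch of that last step, a Dense layer with no activation is just an affine map. The snippet below uses random stand-in values for the LSTM output and the Dense weights (they are assumptions for illustration, not values from a trained model) to show how a (1, 300) vector becomes a (1, 20) forecast.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the LSTM output: batch size 1, 300 features.
h_last = rng.normal(size=(1, 300))

# Dense(20) with the default linear activation: y = h @ kernel + bias.
kernel = rng.normal(scale=0.1, size=(300, 20))
bias = np.zeros(20)

forecast = h_last @ kernel + bias
print(forecast.shape)  # (1, 20)

# Parameter count of the Dense layer: 300 * 20 weights + 20 biases.
n_params = kernel.size + bias.size
print(n_params)  # 6020
```

Each of the 20 outputs is a different weighted combination of the same 300 values, so the forecast for every future step comes from the final LSTM state rather than from a separate cell per step.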
