Felipe Mello

Reputation: 395

Can I train the initial hidden state of a RNN to represent the initial conditions of my model?

I have some time-series data related to a bioreactor. Every 24h I feed glucose to the bioreactor and measure how much of some substances it produced since last feed.

Input: Glucose feed.

Output: Production of substances.

Objective: Estimate these substances concentrations over time, given the glucose I fed.

This bioreactor has some initial conditions, like initial concentration of glucose and substances. Each experiment has a different initial condition. In one experiment I can start with 10mM of a substance, and in another I can start with 100mM, so knowing the starting point is important.

I wanted to use this initial condition to train the initial hidden state of my RNN.


Is there any way I can do that? If not, are there other ways to express initial conditions to an RNN? I am using Python with Keras. Thanks!

In code, I believe it would look something like this:

from tensorflow.keras.layers import Input, Dense, LSTM
from tensorflow.keras.models import Model

input_layer = Input(shape=(16, 3))
hidden_state_dim = 7

mlp_inp = Input(batch_shape=(hidden_state_dim, 1))
mlp_dense_h = Dense(hidden_state_dim, activation='relu')(mlp_inp)
mlp_dense_c = Dense(hidden_state_dim, activation='relu')(mlp_inp)

x = LSTM(7, return_sequences=True)(input_layer, initial_state=[mlp_dense_h, mlp_dense_c])

model = Model(input_layer, x)

But I receive a ValueError: Graph disconnected, probably because there is no backpropagation path to mlp_dense_h/c.

Upvotes: 2

Views: 917

Answers (2)

Shu-Bo Yang

Reputation: 11

You receive the error because you did not include mlp_inp as one of the inputs to the Model. The following revised code works without error:

from tensorflow.keras.layers import Input, Dense, LSTM
from tensorflow.keras.models import Model
from tensorflow.keras.utils import plot_model

input_layer = Input(shape=(16, 3))
hidden_state_dim = 7

mlp_inp = Input(batch_shape=(hidden_state_dim, 1))
mlp_dense_h = Dense(hidden_state_dim, activation='relu')(mlp_inp)
mlp_dense_c = Dense(hidden_state_dim, activation='relu')(mlp_inp)

x = LSTM(7, return_sequences=True)(input_layer, initial_state=[mlp_dense_h, mlp_dense_c])

model = Model(inputs=[input_layer, mlp_inp], outputs=x)  # Here is the change
plot_model(model, to_file='IC.png')

(plot of the two-input model architecture, saved to IC.png)
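For completeness, here is a minimal end-to-end training sketch of this two-input idea. The data arrays are synthetic placeholders; for simplicity the initial-condition input uses shape=(1,) (so the batch size is not fixed) instead of batch_shape, but the architecture is otherwise the same:

```python
import numpy as np
from tensorflow.keras.layers import Input, Dense, LSTM
from tensorflow.keras.models import Model

hidden_state_dim = 7

series_inp = Input(shape=(16, 3))   # 16 daily feeds, 3 features each
ic_inp = Input(shape=(1,))          # one initial-condition value per experiment

# Map the initial condition to the LSTM's initial hidden and cell states
h0 = Dense(hidden_state_dim, activation='relu')(ic_inp)
c0 = Dense(hidden_state_dim, activation='relu')(ic_inp)

x = LSTM(hidden_state_dim, return_sequences=True)(series_inp, initial_state=[h0, c0])

model = Model(inputs=[series_inp, ic_inp], outputs=x)
model.compile(optimizer='adam', loss='mse')

# Synthetic placeholder data: 8 experiments
series = np.random.rand(8, 16, 3)
init_cond = np.random.rand(8, 1)
targets = np.random.rand(8, 16, hidden_state_dim)

model.fit([series, init_cond], targets, epochs=1, verbose=0)
print(model.predict([series, init_cond], verbose=0).shape)  # (8, 16, 7)
```

Because both Dense layers sit inside the graph between an Input and the LSTM, gradients do flow into them during training.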

I am currently working on a similar RNN problem. Maybe we can discuss this interesting problem further.

Upvotes: 1

Pallavi

Reputation: 596

The initial_state parameter of an RNN is not the same as the initial state in your problem. Your initial state is domain-specific; for an RNN, initial_state is the initial value of the model's hidden state. If you want to map your domain's initial conditions into a deep learning model, you need to capture both the correlation between the initial conditions and the output, and the correlation between the time-series variations and the output.

I suggest you try the following steps:

  1. Build an RNN, or better an RNN autoencoder, with the time-series data as both input and output (without initial conditions).
  2. Train this model and grab the intermediate state as a compact representation of the time-series data.
  3. Build a feedforward network whose input is this representation concatenated with your initial-state values, and whose output is your outcome, i.e. the production of substances.
  4. Train this network.

You may have to tweak the models to improve accuracy, e.g. by increasing the size of the autoencoder's intermediate layer, adding or removing layers in either model, or adding regularization.
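The steps above could be sketched as follows. All layer sizes, array names, and the initial-condition dimension are illustrative assumptions, not part of the answer:

```python
import numpy as np
from tensorflow.keras.layers import Input, LSTM, RepeatVector, Dense, Concatenate
from tensorflow.keras.models import Model

timesteps, n_feat, latent_dim, n_out = 16, 3, 8, 2

# Steps 1-2: LSTM autoencoder; the encoder output is the compact representation
ae_inp = Input(shape=(timesteps, n_feat))
encoded = LSTM(latent_dim)(ae_inp)                 # intermediate state
decoded = RepeatVector(timesteps)(encoded)
decoded = LSTM(n_feat, return_sequences=True)(decoded)
autoencoder = Model(ae_inp, decoded)
encoder = Model(ae_inp, encoded)                   # reuses the trained encoder weights
autoencoder.compile(optimizer='adam', loss='mse')

series = np.random.rand(32, timesteps, n_feat)     # synthetic placeholder data
autoencoder.fit(series, series, epochs=1, verbose=0)

# Steps 3-4: concatenate the representation with the initial conditions,
# then a feedforward network predicts the substance production
ic_dim = 2
rep_inp = Input(shape=(latent_dim,))
ic_inp = Input(shape=(ic_dim,))
h = Concatenate()([rep_inp, ic_inp])
h = Dense(16, activation='relu')(h)
out = Dense(n_out)(h)
head = Model([rep_inp, ic_inp], out)
head.compile(optimizer='adam', loss='mse')

reps = encoder.predict(series, verbose=0)
init_conds = np.random.rand(32, ic_dim)
targets = np.random.rand(32, n_out)
head.fit([reps, init_conds], targets, epochs=1, verbose=0)
```

Note that in this design the initial conditions enter only the feedforward head, so the time-series representation is learned independently of them.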

Let me know if you face any problems.

Upvotes: 0
