B_Miner

Reputation: 1820

Keras variable input

I'm working through a Keras example at https://www.tensorflow.org/tutorials/text/text_generation

The model is built here:

import tensorflow as tf

def build_model(vocab_size, embedding_dim, rnn_units, batch_size):
    model = tf.keras.Sequential([
        # batch_input_shape=[batch_size, None]: the time dimension is None,
        # so the sequence length is not fixed by the model.
        tf.keras.layers.Embedding(vocab_size, embedding_dim,
                                  batch_input_shape=[batch_size, None]),
        tf.keras.layers.GRU(rnn_units,
                            return_sequences=True,
                            stateful=True,
                            recurrent_initializer='glorot_uniform'),
        tf.keras.layers.Dense(vocab_size)
    ])
    return model

During training, they always pass in a length-100 array of ints.

But during prediction, they are able to pass in input of any length, and the output is the same length as the input. I was always under the impression that the number of time steps had to be the same. Is that not the case? Can the number of time steps of the RNN somehow change?
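For reference, here is a minimal sketch (the vocab_size, embedding_dim, and rnn_units values are made up) of roughly what the tutorial does at prediction time: rebuild the model with batch_size=1 and feed it inputs of different lengths.

import numpy as np
import tensorflow as tf

# build_model is the function quoted above; the sizes here are arbitrary.
model = build_model(vocab_size=65, embedding_dim=64, rnn_units=128, batch_size=1)

short_input = np.random.randint(0, 65, size=(1, 5))    # 5 time steps
long_input = np.random.randint(0, 65, size=(1, 120))   # 120 time steps

print(model(short_input).shape)  # (1, 5, 65)  - one logit vector per time step
print(model(long_input).shape)   # (1, 120, 65)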

Upvotes: 1

Views: 140

Answers (2)

mujjiga

Reputation: 16856

RNNs are sequence models, i.e., they take in a sequence of inputs and give out a sequence of outputs. The sequence length, also called the number of time steps, is the number of times the RNN cell is unrolled; at each unrolling step an input is passed in and the RNN cell, using its gates, produces an output. So in theory you can have as long a sequence as you want. Now assume you have inputs of different sizes: since you cannot have variable-size inputs within a single batch, you have to collect inputs of the same size to make a batch if you want to train in batches. You can also use a batch size of 1 and not worry about any of this, but training becomes painfully slow.

In practical situations, while training we divide the input into pieces of the same size so that training is fast. There are situations, such as language translation models, where this is not feasible.
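A common way to get same-size batches out of variable-length inputs is to pad them to a common length; a small sketch (the sequences and maxlen here are made up):

import tensorflow as tf

sequences = [[3, 7, 1],
             [5, 2],
             [9, 4, 4, 8, 6]]

# Pad every sequence to length 5 with trailing zeros so they stack into one batch.
padded = tf.keras.preprocessing.sequence.pad_sequences(
    sequences, maxlen=5, padding='post', value=0)
print(padded)
# [[3 7 1 0 0]
#  [5 2 0 0 0]
#  [9 4 4 8 6]]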

So in theory RNNs do not have any limitation on sequence length; however, long sequences will start to lose the context from the beginning as the sequence length increases.

At prediction time you can use any sequence length you want.

In your case the output length is the same as the input length because of return_sequences=True. You can also get a single output by using return_sequences=False, in which case only the output of the last unrolling step is returned by Keras.
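A quick sketch of the difference (the dimensions are made up): with return_sequences=True the GRU emits one output per time step, with return_sequences=False it emits only the last step's output.

import numpy as np
import tensorflow as tf

x = np.random.random((2, 10, 8)).astype('float32')   # batch of 2, 10 time steps, 8 features

gru_seq  = tf.keras.layers.GRU(16, return_sequences=True)
gru_last = tf.keras.layers.GRU(16, return_sequences=False)

print(gru_seq(x).shape)   # (2, 10, 16) - an output per time step
print(gru_last(x).shape)  # (2, 16)     - only the last step's output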

Upvotes: 2

Andrey

Reputation: 6367

The length of the training sequences does not have to equal the prediction length.

An RNN deals with two vectors: the new word and the hidden state (accumulated from the previous words). It doesn't keep track of the sequence length.
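A rough sketch of that idea (the dimensions are made up): a GRU cell is stepped once per input, consuming only the new input vector and the previous hidden state, so nothing about the total sequence length is stored.

import tensorflow as tf

cell = tf.keras.layers.GRUCell(16)
state = [tf.zeros((1, 16))]               # initial hidden state

for step in range(7):                     # any number of steps works
    new_input = tf.random.normal((1, 8))  # embedding of the "new word"
    output, state = cell(new_input, state)

print(output.shape)  # (1, 16)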

But to get good predictions on long sequences, you have to train the RNN with long sequences, because the RNN needs to learn a long context.

Upvotes: 0
