jstaker7

Reputation: 1236

What is the approach for ensembling recurrent neural networks?

It is usually fairly easy to ensemble multiple deep networks to improve prediction quality. This is often as simple as averaging their output predictions together. In a recurrent neural network this isn't as straightforward, since we are making predictions over a sequence of outputs.

How do we ensemble recurrent neural networks? Do you predict the outputs at each time step with multiple models, average the outputs, then feed the prediction from the average back into each separate model (rinse, repeat)? This seems like it would be fairly cumbersome to implement in common ML libraries (I'm using TensorFlow).

Upvotes: 1

Views: 631

Answers (1)

Denny

Reputation: 1230

It seems like what you are talking about can be summarized as "decoding strategies" for RNNs. For example:

  1. You could pick the highest-probability word from a single model and feed it back as the next input (argmax decoding).
  2. You could sample a word from the output probability distribution and use it as the next input.
  3. You could do a beam search, where at each step you keep the k best candidate decodings and expand each of them to produce the next step's candidates.
  4. Similar to what you propose, you can use multiple models or some other more complex decoding strategy to pick the next input.
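Strategies 1, 2, and 4 can be sketched in a few lines of NumPy. This is just an illustration of the per-step choice each strategy makes (the function names are mine, not from any library); a real implementation would also thread the RNN hidden state through each step:

```python
import numpy as np

def argmax_step(probs):
    """Strategy 1: greedy (argmax) decoding -- take the single best token."""
    return int(np.argmax(probs))

def sample_step(probs, rng):
    """Strategy 2: sample the next token from the output distribution."""
    return int(rng.choice(len(probs), p=probs))

def ensemble_step(prob_list):
    """Strategy 4: average the distributions from several models, then
    pick the next token from the averaged distribution."""
    avg = np.mean(prob_list, axis=0)
    return int(np.argmax(avg)), avg
```

Beam search (strategy 3) is the odd one out: it tracks k partial sequences rather than one, so it doesn't reduce to a single per-step function like the others.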

It definitely isn't trivial to implement, but it isn't too bad either. In TensorFlow you would use the raw_rnn function to do this. Basically, it runs a while loop in which you can use an arbitrarily complex function to pick the output and the next input for the RNN.
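Framework details aside, the "rinse, repeat" loop from the question looks roughly like this. It's a plain-Python sketch, not raw_rnn code: each model keeps its own hidden state, all models receive the same token chosen from their averaged output distribution, and the model interface (`step_fn(token, state) -> (probs, new_state)`) is a hypothetical one I made up for the example:

```python
import numpy as np

def ensemble_decode(models, start_token, max_len):
    """Greedy ensemble decoding: average the models' distributions at each
    step and feed the argmax token back into every model.
    `models` is a list of (step_fn, initial_state) pairs, where
    step_fn(token, state) -> (probs, new_state)."""
    states = [init for _, init in models]
    token = start_token
    decoded = []
    for _ in range(max_len):
        # Run every model one step forward on the shared token.
        results = [step(token, s) for (step, _), s in zip(models, states)]
        states = [new_state for _, new_state in results]
        # Average the output distributions and pick the next token.
        avg_probs = np.mean([probs for probs, _ in results], axis=0)
        token = int(np.argmax(avg_probs))
        decoded.append(token)
    return decoded
```

With raw_rnn, the body of this loop would live inside the loop_fn you pass in; the awkward part in practice is that each model in the ensemble needs its own state carried through the loop.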

Upvotes: 2
