Reputation: 451
I want to ask if there is an optimal sequence length for an LSTM network in general, or for time series prediction problems in particular?
I have read about the vanishing and exploding gradient problems that RNNs suffer from on very long sequences, which LSTMs were designed to address and do address to a certain extent.
I have also heard about techniques for handling very long sequences with LSTMs and RNNs in general, such as truncating sequences, summarizing sequences, truncated backpropagation through time, or even using an Encoder-Decoder architecture.
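For context, what I mean by truncating is something like this sliding-window preprocessing over a long series (a minimal sketch with a toy sine series; the function name and numbers are just illustrative):

```python
import numpy as np

def make_windows(series, seq_len):
    """Slice a 1-D series into (fixed-length window -> next value) pairs."""
    X, y = [], []
    for i in range(len(series) - seq_len):
        X.append(series[i:i + seq_len])   # input window of length seq_len
        y.append(series[i + seq_len])     # the value the model should predict
    return np.array(X), np.array(y)

series = np.sin(np.linspace(0, 20, 500))  # toy stand-in for a real time series
X, y = make_windows(series, seq_len=30)   # here the sequence length is 30
print(X.shape, y.shape)                   # (470, 30) (470,)
```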
I am asking because I couldn't find a research paper on this, only this blog post, which stated an optimal sequence length of between 10 and 30.
Upvotes: 1
Views: 3653
Reputation: 1009
TL;DR: Just try it out.
Because training is computationally expensive anyway, the easiest way to gauge how well a model will perform is simply to test it. The best-performing combination cannot easily be determined in advance, especially not with such a vague description (or none at all) of what the actual problem looks like.
From this answer:
It totally depends on the nature of your data and the inner correlations; there is no rule of thumb. However, given that you have a large amount of data, a 2-layer LSTM can model a large body of time series problems / benchmarks.
So in your case, you might want to try sequence lengths from 10 to 30. But I would also evaluate how your model performs outside the range recommended by the post you linked; a minimal sketch of such an experiment follows.
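To make "just try it out" concrete, here is one way such a sweep could look, assuming Keras and a toy sine series as a stand-in for your data. The layer sizes, epoch count, and candidate sequence lengths are arbitrary placeholder choices, not recommendations:

```python
import numpy as np
from tensorflow import keras

def build_model(seq_len):
    # Two stacked LSTM layers, as suggested by the quoted answer.
    return keras.Sequential([
        keras.layers.Input(shape=(seq_len, 1)),
        keras.layers.LSTM(32, return_sequences=True),
        keras.layers.LSTM(32),
        keras.layers.Dense(1),
    ])

series = np.sin(np.linspace(0, 50, 2000))  # replace with your own series

for seq_len in (10, 20, 30):               # candidate sequence lengths
    # Sliding windows: each input is seq_len steps, target is the next value.
    X = np.array([series[i:i + seq_len] for i in range(len(series) - seq_len)])
    y = series[seq_len:]
    X = X[..., np.newaxis]                  # shape (samples, seq_len, 1)
    model = build_model(seq_len)
    model.compile(optimizer="adam", loss="mse")
    hist = model.fit(X, y, epochs=5, validation_split=0.2, verbose=0)
    print(seq_len, hist.history["val_loss"][-1])
```

Whichever sequence length gives the lowest validation loss on your data is a far better guide than any general recommendation.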
Upvotes: 1