Reputation: 482
Assume I want to classify time series, each with 33 time steps. I split them into smaller chunks.
So let's say I have the following input X_1 with shape (32, 3, 1), i.e. 32 samples, 3 time steps, 1 feature:
[
[[1], [2], [3]], # step 1 to step 3 from time series 1
[[11], [14], [17]], # step 1 to step 3 from time series 2
[[3], [5], [7]], # step 1 to step 3 from time series 3
...
[[9], [7], [2]], # step 1 to step 3 from time series 32
]
and Y = [A, A, B, …, B], containing the labels for each of the 32 time series in this batch.
Now I run model.fit(X_1, Y).
Then I take the next 3 time steps of each time series as X_2:
[
[[4], [5], [6]], # step 4 to step 6 from time series 1
[[20], [23], [26]], # step 4 to step 6 from time series 2
[[9], [11], [13]], # step 4 to step 6 from time series 3
...
[[8], [1], [9]], # step 4 to step 6 from time series 32
]
and again the same Y = [A, A, B, …, B].
Because I've split the time series up, I use a stateful model, so that the state from X_1 is saved for X_2.
Again I run model.fit(X_2, Y). I repeat this until I reach X_11, containing time steps 31 to 33 of my input data. After calling model.fit(X_11, Y), I call model.reset_states() because I'm done with the first batch of 32 time series, and I can start again at the beginning with a new batch of 32 time series.
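For reference, the chunking described above could be sketched roughly like this with NumPy. The array `series` and its placeholder values are assumptions made for illustration, not my real data; only the shapes match the description (32 series, 33 time steps, 1 feature, chunks of 3 steps).

```python
import numpy as np

n_series, n_steps, n_features, chunk_len = 32, 33, 1, 3

# Placeholder data standing in for one batch of 32 full-length series.
series = np.arange(n_series * n_steps, dtype=float).reshape(
    n_series, n_steps, n_features
)

# Split along the time axis into 33 / 3 = 11 consecutive chunks.
chunks = np.split(series, n_steps // chunk_len, axis=1)

print(len(chunks))      # 11
print(chunks[0].shape)  # (32, 3, 1)

# chunks[0] plays the role of X_1 and chunks[10] the role of X_11
# (time steps 31 to 33). Each chunk would be passed to model.fit in
# order, with the states reset only after the last one.
```

Row i of every chunk belongs to the same time series i, which is what lets a stateful model carry the state of series i from one chunk to the next.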
At least, until now I thought this was the way to do it. But now I have read that the state is preserved by default across samples in a batch. Does that mean that the state from the first 3 steps of time series 1 in X_1 is also used for the first 3 steps of time series 2? That wouldn't make sense: they have nothing in common, so the state shouldn't be shared between them. So which is correct?
Upvotes: 0
Views: 52
Reputation: 86600
No. The states are matrices in which one of the dimensions is the batch size, meaning there is one row of states per sample.
Series 1 does not communicate with series 2.
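To make the "one row per sample" point concrete, here is a minimal NumPy sketch of a plain tanh recurrent cell. It is not the Keras implementation: the weights `W_x`, `W_h`, `b` and the helper `run_chunk` are hypothetical names invented for this illustration. It only shows that a state of shape (batch size, units) is updated row by row, so changing series 2 leaves the state of every other series untouched.

```python
import numpy as np

rng = np.random.default_rng(0)
batch_size, time_steps, features, units = 32, 3, 1, 4

# Hypothetical weights of a simple tanh RNN cell (illustration only).
W_x = rng.normal(size=(features, units))
W_h = rng.normal(size=(units, units))
b = np.zeros(units)

def run_chunk(x, state):
    """Advance the state over one chunk x of shape (batch, time_steps, features)."""
    for t in range(x.shape[1]):
        # Each row of `state` is combined only with the same row of x.
        state = np.tanh(x[:, t, :] @ W_x + state @ W_h + b)
    return state

x_1 = rng.normal(size=(batch_size, time_steps, features))
x_2 = rng.normal(size=(batch_size, time_steps, features))

# Stateful behaviour: the state after x_1 is the starting state for x_2.
initial = np.zeros((batch_size, units))
state = run_chunk(x_2, run_chunk(x_1, initial))

# Repeat with only series 2 (row index 1) changed in the first chunk.
x_1_mod = x_1.copy()
x_1_mod[1] += 10.0
state_mod = run_chunk(x_2, run_chunk(x_1_mod, initial))

print(state.shape)                           # (32, 4): one row per series
print(np.allclose(state[0], state_mod[0]))   # True: series 1 is unaffected
```

The state is carried along the time axis (from X_1 to X_2 for the same row), never across rows of the batch.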
Upvotes: 1