Reputation: 1070
I have a vector of daily data covering 21 years and want to create rolling windows of 365 days such that each window starts one month (30 days) after the previous one starts. In the question, n_interval
defines the difference between the first data point of one window and the first data point of the next window.
Let's assume my daily data start on Jan. 1, 2000. Then the first column would cover Jan. 1, 2000 to Jan. 1, 2001, the second column would start on Feb. 1, 2000 and end on Feb. 1, 2001, and so on; the last column would cover Jan. 1, 2017 to Jan. 1, 2018. For example, if:
vec = [1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17]
then for a given n_interval = 3 with window_size = 5, the output matrix should look like:
mat = [[1 4 7 10 13],
[2 5 8 11 14],
[3 6 9 12 15],
[4 7 10 13 16],
[5 8 11 14 17]]
Upvotes: 0
Views: 208
Reputation: 22214
Given your example vector
vec = [1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17];
we can create an indexing scheme as follows.
First, we need to determine how many rows there will be in mat. Assuming we want every element of vec to appear in mat at least once, we need to make sure that the last index in the last row is greater than or equal to the size of vec. It's fairly easy to see that the last index in the last row of mat is given by
last_index = n_interval*(n_rows-1) + n_columns
We want to ensure that last_index >= numel(vec). Substituting the above expression into the inequality and solving for n_rows gives
n_rows >= (numel(vec) - n_columns)/n_interval + 1
We assign n_rows to be the ceil of this bound so that it is the smallest integer that satisfies the inequality. Now that we know the number of rows, we generate the list of starting indices for each row:
start_index = 1:n_interval:(n_interval*(n_rows-1)+1);
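As a quick check of the arithmetic above, here is a minimal sketch that plugs in the values from the question (vec = 1:17, n_interval = 3, n_columns = 5). The bound works out to (17 - 5)/3 + 1 = 5, so five rows are enough and the last index in the last row is 3*4 + 5 = 17.

```matlab
% Example values taken from the question
vec = 1:17;
n_interval = 3;
n_columns = 5;

% Lower bound on the row count: (17 - 5)/3 + 1 = 5
n_rows = ceil((numel(vec) - n_columns)/n_interval + 1);

% First index of each window
start_index = 1:n_interval:(n_interval*(n_rows-1)+1);

disp(n_rows)        % 5
disp(start_index)   % 1 4 7 10 13
```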
In the index matrix we want each column to be 1 plus the previous column. In other words, we want to offset the columns according to the array index_offset = 0:(n_columns-1).
Using bsxfun, we generate the index matrix by computing the sums of all pairs between the start_index and index_offset arrays:
index = bsxfun(@plus, index_offset, start_index');
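For reference, on MATLAB R2016b or later the same pairwise sum can be written with implicit expansion instead of bsxfun. The sketch below assumes the example values from the question and simply checks that both forms agree; index_bsx and index_imp are names introduced here for illustration.

```matlab
% Example inputs (values follow from the question's vec = 1:17)
start_index = [1 4 7 10 13];
index_offset = 0:4;

% bsxfun form, as in the answer
index_bsx = bsxfun(@plus, index_offset, start_index');

% Implicit expansion (requires R2016b or later): column + row -> matrix
index_imp = start_index' + index_offset;

assert(isequal(index_bsx, index_imp));
disp(index_imp)   % each row holds the indices of one window
```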
The final thing we need to worry about is going out of bounds. To handle this, we apply the mod function to wrap the out-of-bounds indices:
index_wrapped = mod(index-1, numel(vec))+1;
Then we simply sample the vector according to index_wrapped
mat = vec(index_wrapped);
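Wrapping only matters when numel(vec) - n_columns is not a multiple of n_interval. The sketch below uses a hypothetical 18-element vector to show the wrapped last row, and then a possible variant (not part of the original answer) that keeps only complete windows by using floor instead of ceil. The names n_rows_full and mat_full are introduced here for illustration.

```matlab
vec = 1:18;          % hypothetical input, one element longer than the question's
n_interval = 3;
n_columns = 5;

% Wrapping version, as in the answer
n_rows = ceil((numel(vec) - n_columns)/n_interval + 1);
start_index = 1:n_interval:(n_interval*(n_rows-1)+1);
index = bsxfun(@plus, 0:(n_columns-1), start_index');
mat = vec(mod(index-1, numel(vec)) + 1);
disp(mat(end,:))     % 16 17 18 1 2 (last window wraps around)

% Assumed alternative: keep only windows that fit entirely inside vec
n_rows_full = floor((numel(vec) - n_columns)/n_interval) + 1;
start_full = 1:n_interval:(n_interval*(n_rows_full-1)+1);
index_full = bsxfun(@plus, 0:(n_columns-1), start_full');
mat_full = vec(index_full);   % no mod needed, all indices are in range
disp(mat_full(end,:))         % 13 14 15 16 17
```

With the floor variant the trailing element 18 is not covered by any window, which may or may not be acceptable depending on the use case.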
The complete code is
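% Parameters from the question
n_interval = 3;   % step between the starts of consecutive windows
n_columns = 5;    % window size
vec = 1:17;
% Smallest number of rows that covers every element of vec
n_rows = ceil((numel(vec)-n_columns)/n_interval + 1);
% First index of each window and offsets within a window
start_index = 1:n_interval:(n_interval*(n_rows-1)+1);
index_offset = 0:(n_columns-1);
% All pairwise sums give the index matrix (one window per row)
index = bsxfun(@plus, index_offset, start_index');
% Wrap out-of-bounds indices back to the start of vec
index_wrapped = mod(index-1, numel(vec))+1;
mat = vec(index_wrapped);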
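Note that this code stores one window per row, whereas the matrix in the question stores one window per column. A minimal sketch of getting the question's layout is to transpose the result; mat_by_column and expected are names introduced here for illustration.

```matlab
n_interval = 3;
n_columns = 5;
vec = 1:17;

n_rows = ceil((numel(vec) - n_columns)/n_interval + 1);
start_index = 1:n_interval:(n_interval*(n_rows-1)+1);
index = bsxfun(@plus, 0:(n_columns-1), start_index');
mat = vec(mod(index-1, numel(vec)) + 1);   % one window per row

% Transpose so that each column is one window, as in the question
mat_by_column = mat.';

expected = [1 4 7 10 13;
            2 5 8 11 14;
            3 6 9 12 15;
            4 7 10 13 16;
            5 8 11 14 17];
assert(isequal(mat_by_column, expected));
```

For the daily data described in the question, the analogous settings would presumably be n_columns = 365 and n_interval = 30, assuming a fixed 30-day step is an acceptable stand-in for calendar months.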
Upvotes: 2