mk_sch

Reputation: 1070

Rolling-window matrix with different intervals between columns

I have a vector of daily data covering 21 years and want to create a rolling window of 365 days such that each window starts one month (30 days) after the previous one starts. In the question, n_interval is the offset between the first data points of consecutive windows.

Assume my daily data start on Jan. 1, 2000. The first column would then cover Jan. 1, 2000 to Jan. 1, 2001, the second column would start on Feb. 1, 2000 and end on Feb. 1, 2001, and so on, until the last column, which covers Jan. 1, 2017 to Jan. 1, 2018. For example, if:

vec = [1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17]

then for n_interval = 3 and window_size = 5, the output matrix should look like:

mat = [[1 4 7  10  13],
       [2 5 8  11  14],
       [3 6 9  12  15],
       [4 7 10 13  16],
       [5 8 11 14  17]]

Upvotes: 0

Views: 208

Answers (1)

jodag

Reputation: 22214

Given your example vector

vec = [1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17];

we can create an indexing scheme as follows:

First, we need to determine how many rows mat will have. Assuming we want every element of vec to appear in mat at least once, the last index in the last row must be greater than or equal to the size of vec. The last index in the last row of mat is given by

last_index = n_interval*(n_rows-1) + n_columns

We want to ensure that last_index >= numel(vec). Substituting the above expression into this inequality and solving for n_rows gives

n_rows >= (numel(vec) - n_columns)/n_interval + 1

We set n_rows to the ceil of this bound, so that it is the smallest integer that satisfies the inequality. Now that we know the number of rows, we generate the list of starting indices for each row:

start_index = 1:n_interval:(n_interval*(n_rows-1)+1);
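As a quick sanity check of the bound, the sketch below evaluates it for the example values from the question (17 elements, n_columns = 5, n_interval = 3). The variable n_elements is just a stand-in for numel(vec).

```matlab
n_interval = 3;
n_columns  = 5;
n_elements = 17;                                  % stand-in for numel(vec)

% Smallest integer satisfying n_rows >= (n_elements - n_columns)/n_interval + 1
n_rows = ceil((n_elements - n_columns)/n_interval + 1);

% First index of each row; consecutive starts are n_interval apart
start_index = 1:n_interval:(n_interval*(n_rows-1)+1);

% Last index touched by the last row; must be at least n_elements
last_index = n_interval*(n_rows-1) + n_columns;

disp(n_rows)        % 5
disp(start_index)   % 1 4 7 10 13
disp(last_index)    % 17
```

Here the bound is met exactly, so the last row ends on the last element of vec.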

In the index matrix, each column should be 1 plus the previous column. In other words, we offset the columns according to the array index_offset = 0:(n_columns-1).

Using bsxfun, we generate the index matrix by computing the sums of all pairs between the start_index and index_offset arrays:

index = bsxfun(@plus, index_offset, start_index');
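On MATLAB R2016b or later, implicit expansion should make the bsxfun call optional. The sketch below compares the two forms for the example values, with index_offset spanning the window length, 0:(n_columns-1), as in the complete code.

```matlab
n_interval   = 3;
n_columns    = 5;
start_index  = 1:n_interval:13;        % [1 4 7 10 13] for the 17-element example
index_offset = 0:(n_columns-1);        % [0 1 2 3 4]

% Column vector of starts plus row vector of offsets gives every pairwise sum
index_bsx = bsxfun(@plus, index_offset, start_index');

% Same result via implicit expansion (requires R2016b or newer)
index_imp = start_index' + index_offset;

disp(isequal(index_bsx, index_imp))    % 1
disp(index_bsx(2,:))                   % 4 5 6 7 8
```

The bsxfun form remains the safer choice if the code has to run on older releases.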

The final thing we need to worry about is going out of bounds. To handle this, we apply the mod function to wrap the out-of-bounds indices:

index_wrapped = mod(index-1, numel(vec))+1;
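With the 17-element example no index actually overflows, so the wrap has no visible effect. To see it in action, the sketch below uses a hypothetical length of 18 (not from the question), chosen so that the last row runs past the end of the vector.

```matlab
n_elements = 18;                 % hypothetical length, chosen so the last row overflows
n_interval = 3;
n_columns  = 5;

n_rows      = ceil((n_elements - n_columns)/n_interval + 1);   % 6
start_index = 1:n_interval:(n_interval*(n_rows-1)+1);          % 1 4 7 10 13 16
index       = bsxfun(@plus, 0:(n_columns-1), start_index');

% Shift to 0-based, wrap around the vector length, shift back to 1-based
index_wrapped = mod(index-1, n_elements) + 1;

disp(index(end,:))           % 16 17 18 19 20
disp(index_wrapped(end,:))   % 16 17 18  1  2
```

The wrapped row reuses the first elements of the vector, which may or may not be what is wanted for a time series.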

Then we simply sample the vector according to index_wrapped:

mat = vec(index_wrapped);

The complete code is

n_interval = 3;
n_columns = 5;
vec = 1:17;

n_rows = ceil((numel(vec)-n_columns)/n_interval + 1);
start_index = 1:n_interval:(n_interval*(n_rows-1)+1);
index_offset = 0:(n_columns-1);
index = bsxfun(@plus, index_offset, start_index');
index_wrapped = mod(index-1, numel(vec))+1;
mat = vec(index_wrapped);
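Note that this code stores one window per row, whereas the matrix in the question has one window per column. The variant below is a sketch under two assumptions that go beyond the answer: windows are laid out as columns, and only complete windows are kept (floor instead of ceil), so no wrapping is needed. The name n_windows is introduced here for illustration.

```matlab
n_interval = 3;
n_columns  = 5;          % window length
vec = 1:17;

% Assumption: only complete windows are wanted, so round down instead of up
n_windows = floor((numel(vec) - n_columns)/n_interval) + 1;

start_index  = 1 + n_interval*(0:n_windows-1);     % first index of each window
index_offset = (0:n_columns-1)';                   % offsets down each column

% One window per column, matching the layout shown in the question
index = bsxfun(@plus, index_offset, start_index);
mat = vec(index);

disp(mat)
```

For the daily data in the question, the same lines should apply with n_columns = 365 and n_interval = 30, although fixed 30-day steps will drift away from calendar month boundaries over time.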

Upvotes: 2
