thatWaterGuy

Reputation: 315

MATLAB - Creating a matrix of lagged time step vectors (similar to padding?)

I have a 429x1 vector that represents a hydrological time series. I want to "lag" the time series one time step at a time and assemble the lagged copies into a matrix for input into nftool for some ANN analysis. The width of the matrix is controlled by the number of input neurons in my input layer, which is a value I read in from a spreadsheet. Here is what I would like to do, using a shorter time series to illustrate:

inp_neur = 5; % number of input neurons (read in from Excel)
A = [9;6;8;3;2]; % hypothetical hydrological time series

% zero-padding / lagging step goes here

RESULT:

newA =

 9     0     0     0     0
 6     9     0     0     0
 8     6     9     0     0
 3     8     6     9     0
 2     3     8     6     9

I'm sure this isn't the hardest thing to do, but can it be done in a one-liner?

Any help would be greatly appreciated.

Cheers,

JQ

Another example, with inp_neur = 7:

A = [11;35;63;21;45;26;29;84;51]

newA =

11  0   0   0   0   0   0
35  11  0   0   0   0   0
63  35  11  0   0   0   0
21  63  35  11  0   0   0
45  21  63  35  11  0   0
26  45  21  63  35  11  0
29  26  45  21  63  35  11
84  29  26  45  21  63  35
51  84  29  26  45  21  63

Upvotes: 4

Views: 6024

Answers (2)

Colin T Bowers

Reputation: 18560

I know that this question already has an accepted answer. However, I think it is worth pointing out that the accepted answer will be very inefficient if T (the number of observations in the time series) is much larger than K (the number of lags, i.e. inp_neur in the OP's notation). This is because it creates a T-by-T matrix and then truncates it to T-by-K.
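To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes double precision (8 bytes per element) and uses the T = 1000, K = 10 case from the timed runs later in this answer; the variable names are only for illustration.

```matlab
% Rough memory comparison, assuming 8-byte doubles (an assumption, not measured)
T = 1000;                 % number of observations
K = 10;                   % number of lags (inp_neur)
bytesPerDouble = 8;

bytesFull = T * T * bytesPerDouble;   % T-by-T Toeplitz intermediate
bytesLag  = T * K * bytesPerDouble;   % T-by-K lag matrix that is actually kept

fprintf('T-by-T intermediate: %.2f MB\n', bytesFull / 1e6);
fprintf('T-by-K result:       %.2f MB\n', bytesLag / 1e6);
fprintf('Ratio (T/K):         %g\n', bytesFull / bytesLag);
```

The intermediate is T/K times larger than the result, so the wasted work grows with the length of the series rather than with the number of lags.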

I would propose two possible alternatives. The first uses a function from the Econometrics Toolbox designed to do exactly what the OP wants: lagmatrix. The second is a loop-based solution.

The lagmatrix solution returns NaN where the OP wants 0, so an additional line is necessary to convert them. The full solution is:

K = inp_neur; %# number of lags
newA2 = lagmatrix(A, 0:K-1);
newA2(isnan(newA2)) = 0;

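The loop-based solution is:

T = numel(A); %# number of observations; K is the number of lags (inp_neur)
newA3 = zeros(T, K);
for k = 1:K
    newA3(k:end, k) = A(1:end-k+1); %# column k is A delayed by k-1 steps
end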
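As a quick sanity check, here is the loop applied to the first example from the question. This is just a sketch that hard-codes the OP's short series and compares the result against the matrix shown in the question.

```matlab
A = [9; 6; 8; 3; 2];   % short series from the question
K = 5;                 % inp_neur in the question
T = numel(A);          % number of observations

newA3 = zeros(T, K);
for k = 1:K
    % Column k holds A delayed by k-1 steps; the rows above it stay zero
    newA3(k:end, k) = A(1:end-k+1);
end

% Matrix the OP asked for
expected = [9 0 0 0 0; 6 9 0 0 0; 8 6 9 0 0; 3 8 6 9 0; 2 3 8 6 9];
assert(isequal(newA3, expected));
disp(newA3)
```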

The obvious advantage of the loop-based solution is that it does not require the Econometrics Toolbox. But is that the only advantage? Let's try some timed runs (each figure is the total for M = 1000 repetitions). Set T = K = 10. Then:

Elapsed time is 0.045809 seconds. %# 3lectrologos' solution
Elapsed time is 0.049845 seconds. %# lagmatrix solution
Elapsed time is 0.017340 seconds. %# loop solution

3lectrologos' solution and the lagmatrix solution take essentially the same time. The loop-based solution is roughly 3 times faster! Now, to emphasize the problem with 3lectrologos' solution, set T = 1000 and K = 10. Then:

Elapsed time is 10.615298 seconds. %# 3lectrologos' solution
Elapsed time is 0.149164 seconds. %# lagmatrix solution
Elapsed time is 0.056074 seconds. %# loop solution

Now 3lectrologos' solution is nearly two orders of magnitude (about 70 times) slower than the lagmatrix solution. But the real winner on the day is the loop-based solution, which still manages to be roughly 3 times faster than the lagmatrix solution.

Conclusion: Don't discount single loops in MATLAB anymore. They are getting really fast!

For those who are interested, the code for the timed runs is below:

M = 1000; %# Number of iterations for timed test
T = 1000; %# Length of your vector of inputs
K = 10; %# Equivalent to your inp_neur
A = randi(20, T, 1); %# Generate random data

%# 3lectrologos solution (inefficient if T is large relative to K)
tic
for m = 1:M
    tmp = tril(toeplitz(A));
    newA1 = tmp(:, 1:K);
end
toc

%# lagmatrix solution
tic
for m = 1:M
    newA2 = lagmatrix(A, 0:K-1);
    newA2(isnan(newA2)) = 0;
end
toc

%# Loop based solution
tic
for m = 1:M
    newA3 = zeros(T, K);
    for k = 1:K
        newA3(k:end, k) = A(1:end-k+1);
    end
end
toc
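Timings only mean something if the candidates return the same matrix, so a cross-check along the following lines may be worth running first. It is a sketch, not part of the original benchmark: T is kept small so the T-by-T intermediate is cheap, and the lagmatrix branch is skipped when the Econometrics Toolbox is not on the path.

```matlab
T = 200;                 % kept small so the T-by-T Toeplitz matrix is cheap
K = 10;
A = randi(20, T, 1);     % values are >= 1, so the data contains no zeros or NaN

% Toeplitz approach
tmp = tril(toeplitz(A));
newA1 = tmp(:, 1:K);

% Loop approach
newA3 = zeros(T, K);
for k = 1:K
    newA3(k:end, k) = A(1:end-k+1);
end

assert(isequal(newA1, newA3), 'Toeplitz and loop results differ');

% lagmatrix approach, only if the Econometrics Toolbox is available
if exist('lagmatrix', 'file') == 2
    newA2 = lagmatrix(A, 0:K-1);
    newA2(isnan(newA2)) = 0;
    assert(isequal(newA2, newA3), 'lagmatrix and loop results differ');
end
```

One caveat with the lagmatrix route: the isnan replacement also overwrites any genuine NaN values in the series (missing observations, for example), whereas the loop and Toeplitz versions leave them in place.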

Upvotes: 3

3lectrologos

Reputation: 9652

Here's a two-liner:

tmp = tril(toeplitz(A));
newA = tmp(:, 1:inp_neur);
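A possible variant, sketched here rather than taken from the answer above: toeplitz also accepts a first column and a first row as two separate arguments, so passing the series as the column and A(1) followed by zeros as the row builds the numel(A)-by-inp_neur matrix directly, without the square intermediate. The name firstRow is only illustrative, and this form has not been run against the timings quoted elsewhere on this page.

```matlab
A = [9; 6; 8; 3; 2];     % short series from the question
inp_neur = 3;            % fewer lags than observations, to show the non-square shape

% First column is the series itself; first row is A(1) followed by zeros.
% The first elements of the column and the row must match, otherwise
% toeplitz issues a warning and uses the column element.
firstRow = [A(1), zeros(1, inp_neur - 1)];
newA = toeplitz(A, firstRow);

disp(newA)
```

For this input the result is the 5-by-3 matrix [9 0 0; 6 9 0; 8 6 9; 3 8 6; 2 3 8]. Folding firstRow into the call gives a one-liner: newA = toeplitz(A, [A(1), zeros(1, inp_neur-1)]);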

Upvotes: 2
