Adam
Adam

Reputation: 990

matlab make matrix from vectors with gaps

I have two sets of time-series data, indexed by dates in the first columns; values are in the second. Dates are represented in yyyymmdd format (e.g. Pearl Harbor Day is 19411207) The two time series overlap, but neither is a subset of the other, and even in the overlap, there may be dates missing, and due to the format, there are gaps in the numerical sequence. Since these dates end up just being integers, I will substitute small numbers below for ease of reading.

I want to turn these two matrices (a 26622x2 and a 38067x2) into a single three-column matrix (38103x3, as it turns out.)

The merge I am after is this:

 Index AVal             Index BVal             Index AVal   BVal
   1     2.5               3    6.5             1     2.5
   2     3.4     +         5    8.9   =         2     3.4
   4     5.6               7    9.1             3            6.5
   5     7.8               8    7.1             4     5.6
   7     8.00                                   5     7.8    8.9
                                                7     8.00   9.1
                                                8            7.1

In Excel, I would do this with VLookup. I can imagine an approach in Matlab that uses lots of conditionals, loops and stored indices, but I am wondering if there isn't possibly a very simple set of join/intersect-type commands that would accomplish the same thing.

Any thoughts?

Upvotes: 2

Views: 207

Answers (1)

chappjc
chappjc

Reputation: 30579

Try setdiff to identify the dates in B not in A, then concatenate with NaNs or some other value for missing values:

A = [1 2.5; 2 3.4; 4 5.6; 5 7.8; 7 8.0];
B = [3 6.5; 5 8.9; 7 9.1; 8 7.1];

[BnA,iB] = setdiff(B(:,1),A(:,1));
C = [A NaN(size(A,1),1); BnA NaN(numel(BnA),1) B(iB,2)]
C =
    1.0000    2.5000       NaN
    2.0000    3.4000       NaN
    4.0000    5.6000       NaN
    5.0000    7.8000       NaN
    7.0000    8.0000       NaN
    3.0000       NaN    6.5000
    8.0000       NaN    7.100

Then use intersect to handle values that are common to both:

[AB,iA,iBA] = intersect(A(:,1),B(:,1));
C(iA,3) = B(iBA,2)
C =
    1.0000    2.5000       NaN
    2.0000    3.4000       NaN
    4.0000    5.6000       NaN
    5.0000    7.8000    8.9000
    7.0000    8.0000    9.1000
    3.0000       NaN    6.5000
    8.0000       NaN    7.1000

Then use sortrows to sort according to to the first column:

C = sortrows(C,1)
C =
    1.0000    2.5000       NaN
    2.0000    3.4000       NaN
    3.0000       NaN    6.5000
    4.0000    5.6000       NaN
    5.0000    7.8000    8.9000
    7.0000    8.0000    9.1000
    8.0000       NaN    7.1000

Upvotes: 3

Related Questions