Reputation: 990
I have two sets of time-series data, indexed by dates in the first columns; values are in the second. Dates are represented in yyyymmdd format (e.g. Pearl Harbor Day is 19411207) The two time series overlap, but neither is a subset of the other, and even in the overlap, there may be dates missing, and due to the format, there are gaps in the numerical sequence. Since these dates end up just being integers, I will substitute small numbers below for ease of reading.
I want to turn these two matrices (a 26622x2 and a 38067x2) into a single three-column matrix (38103x3, as it turns out.)
The merge I am after is this:
Index AVal Index BVal Index AVal BVal
1 2.5 3 6.5 1 2.5
2 3.4 + 5 8.9 = 2 3.4
4 5.6 7 9.1 3 6.5
5 7.8 8 7.1 4 5.6
7 8.00 5 7.8 8.9
7 8.00 9.1
8 7.1
In Excel, I would do this with VLookup. I can imagine an approach in Matlab that uses lots of conditionals, loops and stored indices, but I am wondering if there isn't possibly a very simple set of join/intersect-type commands that would accomplish the same thing.
Any thoughts?
Upvotes: 2
Views: 207
Reputation: 30579
Try setdiff
to identify the dates in B
not in A
, then concatenate with NaN
s or some other value for missing values:
A = [1 2.5; 2 3.4; 4 5.6; 5 7.8; 7 8.0];
B = [3 6.5; 5 8.9; 7 9.1; 8 7.1];
[BnA,iB] = setdiff(B(:,1),A(:,1));
C = [A NaN(size(A,1),1); BnA NaN(numel(BnA),1) B(iB,2)]
C =
1.0000 2.5000 NaN
2.0000 3.4000 NaN
4.0000 5.6000 NaN
5.0000 7.8000 NaN
7.0000 8.0000 NaN
3.0000 NaN 6.5000
8.0000 NaN 7.100
Then use intersect
to handle values that are common to both:
[AB,iA,iBA] = intersect(A(:,1),B(:,1));
C(iA,3) = B(iBA,2)
C =
1.0000 2.5000 NaN
2.0000 3.4000 NaN
4.0000 5.6000 NaN
5.0000 7.8000 8.9000
7.0000 8.0000 9.1000
3.0000 NaN 6.5000
8.0000 NaN 7.1000
Then use sortrows
to sort according to to the first column:
C = sortrows(C,1)
C =
1.0000 2.5000 NaN
2.0000 3.4000 NaN
3.0000 NaN 6.5000
4.0000 5.6000 NaN
5.0000 7.8000 8.9000
7.0000 8.0000 9.1000
8.0000 NaN 7.1000
Upvotes: 3