Reputation: 8419
I have two (or more but if solved for two, it's solved for any number) 2-by-N matrices which represent points with an x (the first row) and y (the second row) coordinates. The points are always sorted in the increasing x coordinate. What I want to do is I want to merge these two matrices into one 3-by-N matrix so that if two points (one from each matrix) have the same x coordinate, they would form one column in the new matrix, the first row being the x coordinate and the second and third row being the two y coordinates. However, if there is a point in one matrix that has x coordinate different than all other points in the second matrix, I still want to have full 3-element column that is placed such that the x coordinates are still sorted and the missing value from the other matrix is replaced by the nearest value with lower x coordinate (or NaN if there is none).
Better to explain by example.
First matrix:
1 3 5 7 % x coordinate
1 2 3 4 % y coordinate
Second matrix:
2 3 4 7 8 % x coordinate
5 6 7 8 9 % y coordinate
Desired result:
1 2 3 4 5 7 8 % x coordinate
1 1 2 2 3 4 4 % y coordinate from first matrix
NaN 5 6 7 7 8 9 % y coordinate from second matrix
My question is, how can I do it effectively in matlab/octave and numpy? (Effectively because I can always do it "manually" with loops but that doesn't seem right.)
Upvotes: 1
Views: 91
Reputation: 30175
Your example:
a = [1 3 5 7; 1 2 3 4];
b = [2 3 4 7 8; 5 6 7 8 9];
% Get the combined (unique, sorted) `x` coordinates
output(1,:) = unique([a(1,:), b(1,:)]);
% Initialise y values to NaN
output(2:3, :) = NaN;
% Add x coords from `a` and `b`
output(2, ismember(output(1,:),a(1,:))) = a(2,:);
output(3, ismember(output(1,:),b(1,:))) = b(2,:);
% Replace NaNs in columns `2:end` with the previous value.
% A simple loop has the advantage of capturing multiple consecutive NaNs.
for ii = 2:size(output,2)
colNaN = isnan(output(:, ii));
output(colNaN, ii) = output(colNaN, ii-1);
end
If you have more than 2 matrices (as suggested in your question) then I'd advise
ismember
, instead of having one code line per matrix hardcoded. This is the generic solution for any number of matrices, demonstrated with a
and b
:
mats = {a, b};
cmats = horzcat(mats);
output(1, :) = unique(cmats(1,:));
output(2:numel(mats)+1, :) = NaN;
for ii = 1:size(mats)
output(ii+1, ismember(output(1,:), mats{ii}(1,:))) = mats{ii}(2,:);
end
for ii = 2:size(output,2)
colNaN = isnan(output(:,ii));
output(colNaN, ii) = output(colNaN, ii-1);
end
Upvotes: 1
Reputation: 3052
You can do it with interp1
and the keyword 'previous'
for strategy (you can also choose 'nearest'
if you do not care if it is larger or smaller) and 'extrap'
for allowing extrapolation.
Define the matrices
a=[...
1 3 5 7;...
1 2 3 4];
b=[...
2 3 4 7 8;...
5 6 7 8 9];
Then find the interpolation points
x = unique([a(1,:),b(1,:)]);
And interpolate
[x ; interp1(a(1,:),a(2,:),x,'previous','extrap') ; interp1(b(1,:),b(2,:),x,'previous','extrap') ]
Timeit results:
I tested the algorithms on
n = 1e6;
a = cumsum(randi(3,2,n),2);
b = cumsum(randi(2,2,n),2);
and got:
Upvotes: 5
Reputation: 11638
This verions uses set operations:
a=[...
1 3 5 7;...
1 2 3 4];
b=[...
2 3 4 7 8;...
5 6 7 8 9];
% compute union of x coordinates
c = union(a(1,:),b(1,:));
% find indices of x of a and b coordinates in c
[~,~,ia] = intersect(a(1,:),c);
[~,~,ib] = intersect(b(1,:),c);
% create output matrix
d = NaN(3,numel(c));
d(1,:) = c;
d(2,ia) = a(2,:);
d(3,ib) = b(2,:);
% fill NaNs
m = isnan(d);
m(:,1) = false;
i = find(m(:,[2:end,1])); %if you have multiple consecutive nans you have to repeat these two steps
d(m) = d(i);
disp(d);
Upvotes: 2