Reputation: 367
I have two matrices A(m X 3) and B(n X 3); where m >> n.
Numbers in B have close or equal values to numbers in A.
I want to search closest possible values from A to the values present in B in a way that at the end of search, A will reduced to (n X 3).
There are two main issues:
I hope I am clear in asking my question. Does anybody know about any function already present in matlab for this thing?
Upvotes: 1
Views: 1339
Reputation: 30579
Depending on how you look at the task, here are two different approaches
Two ways to look at this: (1) closest point in A
for each point in B
, or (2) closest point in B
for each point in A
.
Closest point in A
For each point in B
you can find the closest point in A
(e.g. Euclidean distance), as requested in comments:
% Calculate all MxN high-dimensional (3D space) distances at once
distances = squeeze(sum(bsxfun(@minus,B,permute(A,[3 2 1])).^2,2));
% Find closest row in A for each point in B
[~,ik] = min(distances,[],2)
Make an array the size of B
containing these closest points in A
:
Anew = A(ik,:)
This will implicitly throw out any points in A
that are too far from points in B
, as long as each point in B
does have a match in A
. If each point in B
does not necessarily have a "match" (point at an acceptable distance) in A
, then it is necessary to actively reject points based on distances
, resulting in an output that would be shorter than B
. This solution seems out of scope.
Closest point in B
Compute the Euclidenan distance from each point (row) in A
to each point in B
and identify the closest point in B
:
distances = squeeze(sum(bsxfun(@minus,A,permute(B,[3 2 1])).^2,2));
[~,ik] = min(distances,[],2)
Make an array the size of A
containing these closest points in B
:
Anew = B(ik,:)
The size of Anew
in this approach is the same as A
.
Another approach is to use the undocumented _mergesimpts
function.
Consider this test data:
>> B = randi(5,4,3)
B =
1 4 4
2 3 4
1 3 4
3 4 5
>> tol = 0.001;
>> A = repmat(B,3,1) + tol * rand(size(B,1)*3,3)
A =
1.0004 4.0005 4.0000
2.0004 3.0005 4.0008
1.0004 3.0009 4.0002
3.0008 4.0005 5.0004
1.0006 4.0004 4.0007
2.0008 3.0007 4.0004
1.0009 3.0007 4.0007
3.0010 4.0005 5.0004
1.0002 4.0003 4.0007
2.0001 3.0001 4.0007
1.0007 3.0006 4.0004
3.0001 4.0003 5.0000
Merge similar rows in A
according to a specified tolerance, tol
:
>> builtin('_mergesimpts',A,tol,'average')
ans =
1.0004 4.0004 4.0005
1.0007 3.0007 4.0005
2.0005 3.0005 4.0006
3.0006 4.0004 5.0003
Merge similar rows, using B
to get expected numbers
>> builtin('_mergesimpts',[A; B],tol,'first')
ans =
1 3 4
1 4 4
2 3 4
3 4 5
Upvotes: 4
Reputation: 112679
To replace each row of A
by the closest row of B
You can use pdist2
to compute distance between rows, and then the second output of min
to find the index of the minimum-distance row:
[~, ind] = min(pdist2(B,A,'euclidean')); %// or specify some other distance
result = B(ind,:);
The advantage of this approach is that pdist2
lets you specify other distance functions, or even define your own. For example, to use L1 distance change first line to
[~, ind] = min(pdist2(B,A,'cityblock'));
To retain rows of A
which are closest to rows of B
Use pdist2
as above. For each row of A
compute the minimum distance to rows of B
. Retain the n
rows of A
with lowest value of that minimum distance:
[~, ii] = sort(min(pdist2(B,A,'euclidean'))); %// or use some other distance
result = A(ii(1:n),:);
Upvotes: 1
Reputation: 221564
Try this code -
%% Create data m=7; n=4; TOL = 0.0005; A = rand(m,3)/100; B = rand(n,3)/100; B(2,:) = A(5,:); % For testing that the matching part of the second row from B must be the fifth row from A %% Interesting part B2 = repmat(reshape(B',1,3,n),[m 1]); closeness_matrix = abs(bsxfun(@minus, A, B2)); closeness_matrix(closeness_matrix<TOL)=0; closeness_matrix_mean = mean(closeness_matrix,2); % Assuming that the "closeness" for the triplets on each row can be measured by the mean value of them X1 = squeeze(closeness_matrix_mean); [minval,minind] = min(X1,[],1); close_indices = minind'; A_final = A(close_indices,:)
Upvotes: 0