Reputation: 3783
I have an array like this:
1003 1007 0.0140589588522423
1059 1185 0.0336635172202602
1003 1093 0.0403056531253910
1003 1111 0.0417787840566580
1059 1127 0.0437157438475326
1082 1092 0.0532154519263457
1076 1185 0.0584688899071887
1003 1129 0.0585907987209575
1003 1045 0.0626826958352425
1003 1070 0.0660757861128676
1003 1014 0.0662929607751338
The first column is the name of point one, the second column is the name of point two, and the third column is the distance between the two points, in the range [0, 1]. A higher value in the third column means a greater distance between the two points, and a lower value means a smaller distance. I have this data for more than 20,000 points. Now I want a pattern (graph) and more information to better understand the distances or hypothetical positions of the points. For example, I want to cluster nearby points together, and then cluster nearby clusters into bigger clusters. How can I do this using MATLAB? I have this data for all pairs of points.
Upvotes: 0
Views: 202
Reputation: 1025
One of the easier ways to get a near-exact configuration is to use cmdscale() (https://www.mathworks.com/help/stats/cmdscale.html), which finds a configuration of points consistent with your distance constraints.
Given the distances between points, cmdscale() returns an n*p matrix of n points in p dimensions, where p is the smallest number of dimensions in which the points can be embedded.
You will have to reorganize your data into an n*n matrix of distances between each pair of points, but to get any decent graphical representation of this type of data, that would have to be done anyway.
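% Symmetric 3-by-3 distance matrix with a zero diagonal
distance = [0.0 0.1 0.2;
            0.1 0.0 0.3;
            0.2 0.3 0.0];
Y = cmdscale(distance);        % n-by-p coordinates of the recovered points
% These three points are collinear (0.1 + 0.2 = 0.3), so the solution is 1-dimensional
plot(ones(3,1), Y(:,1), 'o');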
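The reorganization into an n*n matrix mentioned above could look roughly like the sketch below. It assumes the pair list is already loaded as a numeric three-column matrix; the variable names (pairs, names, idx, D) are made up for illustration, the first two rows are taken from the question, and the third row is an invented value added so that every pair is present.

```matlab
% Hypothetical pair list: [pointA, pointB, distance], one row per pair.
pairs = [1003 1007 0.0140589588522423;
         1003 1093 0.0403056531253910;
         1007 1093 0.0350];            % invented value, for illustration only

% Map the arbitrary point names to consecutive indices 1..n.
[names, ~, idx] = unique(pairs(:, 1:2));
idx = reshape(idx, [], 2);             % column 1: pointA index, column 2: pointB index
n = numel(names);

% Write each distance into D(i,j), then mirror it so D is symmetric
% with a zero diagonal (assumes each pair is listed only once).
D = zeros(n);
D(sub2ind([n n], idx(:,1), idx(:,2))) = pairs(:,3);
D = max(D, D');
```

The resulting D can then be passed to cmdscale(D), and names(k) gives the original name of the k-th row. Note that a full 20,000*20,000 matrix of doubles takes about 3.2 GB, so this dense form is only practical if that fits in memory.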
For 20,000 data points you will almost certainly need more dimensions (hopefully your data is already constrained to 2D or 3D). If it is not constrained to 3 dimensions, the configuration matrix Y will have to be reduced to its first few columns (see https://www.mathworks.com/help/stats/cmdscale.html for help) and you will lose some accuracy (but that is true of any low-dimensional representation of higher-dimensional data).
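As a rough sketch of that reduction, the second output of cmdscale() holds the eigenvalues, which give a hint of how much is lost by keeping only the first few columns of Y. The distance matrix below is made up (the four corners of a 3-by-4 rectangle, so it embeds exactly in 2D), and the names Y2 and explained are illustrative only.

```matlab
% Made-up distance matrix: corners of a 3-by-4 rectangle (diagonal length 5).
D = [0 3 4 5;
     3 0 5 4;
     4 5 0 3;
     5 4 3 0];

[Y, e] = cmdscale(D);          % e: eigenvalues, largest first

% Keep at most the first two coordinates for plotting.
p  = min(2, size(Y, 2));
Y2 = Y(:, 1:p);

% Share of the positive eigenvalue mass captured by the kept dimensions;
% a value close to 1 suggests little is lost by the reduction.
explained = sum(e(1:p)) / sum(e(e > 0));

plot(Y2(:,1), Y2(:,2), 'o');
```

For real data the share will usually be below 1, and comparing pdist(Y2) with the original distances is one way to see how much distortion the reduction introduces.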
If you are interested in a more probabilistic representation, I have had much more success with network engines that use dynamic link physics/gravity to pull nodes together based on weights, but I have not seen anything that dynamic built into MATLAB.
Upvotes: 1