Eghbal

Reputation: 3783

Draw (and process) graph by distances between points data (MATLAB)

I have an array like this:

1003    1007    0.0140589588522423
1059    1185    0.0336635172202602
1003    1093    0.0403056531253910
1003    1111    0.0417787840566580
1059    1127    0.0437157438475326
1082    1092    0.0532154519263457
1076    1185    0.0584688899071887
1003    1129    0.0585907987209575
1003    1045    0.0626826958352425
1003    1070    0.0660757861128676
1003    1014    0.0662929607751338

The first column is the name of the first point, the second column is the name of the second point, and the third column is the distance between the two, in the range [0, 1]. A larger value in the third column means the two points are farther apart, and a smaller value means they are closer together. I have this data for every pair among more than 20,000 points. Now I want a plot (graph) and more information that give a better understanding of the distances or the hypothetical positions of the points. For example, I want to cluster nearby points together and then group nearby clusters into bigger clusters. How can I do this using MATLAB?

Upvotes: 0

Views: 202

Answers (1)

Brendan Frick

Reputation: 1025

One of the easier ways to get a near-exact configuration is to use cmdscale() (https://www.mathworks.com/help/stats/cmdscale.html), which finds a possible configuration of points given your distance constraints.

Given the distances between points (as an n*n matrix, or as a vector in the form pdist() produces), cmdscale() will return an n*p matrix of n points in p dimensions, where p is the smallest dimension in which the distances can be reproduced.

You will have to reorganize your data into an n*n matrix of distances between each point, but to get any decent graphical representation of this type of data, that would have to be done anyway.
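The reorganization step can be sketched as follows. This is only a minimal illustration, assuming the pair list is already loaded as a numeric three-column array; the variable names (edges, names, idx, D) and the three sample rows are made up for the example, and any pair missing from the list would be left at 0.

```matlab
% Hypothetical pair list: [point A, point B, distance]
edges = [1003 1007 0.1;
         1003 1059 0.2;
         1007 1059 0.3];

% Map the arbitrary point names to consecutive indices 1..n
[names, ~, idx] = unique(edges(:, 1:2));
idx = reshape(idx, [], 2);          % back to one [row, col] index pair per edge
n = numel(names);

% Fill one triangle of the matrix, then mirror it to make it symmetric
D = zeros(n);
D(sub2ind([n n], idx(:, 1), idx(:, 2))) = edges(:, 3);
D = max(D, D.');                    % assumes each pair is listed only once
```

Note that for 20,000 points a full matrix of doubles has 20,000 x 20,000 entries, which is about 3.2 GB, so memory is worth checking before building it.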

distance = [0.0 0.1 0.2;
            0.1 0.0 0.3;
            0.2 0.3 0.0];
Y = cmdscale(distance); 
plot(ones(3,1),Y,'o'); % In this case my solution is 1 dimensional


For 20,000 data points you will almost certainly need more dimensions (hopefully your data is already constrained to 2D or 3D). If it is not constrained to 3 dimensions, the configuration matrix Y will have to be reduced to its first few columns (see https://www.mathworks.com/help/stats/cmdscale.html for help) and you will lose some accuracy (but that is true of any attempt to represent higher-dimensional data in fewer dimensions).
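As a rough sketch of that reduction, the snippet below keeps only the first two columns of Y and measures how far the 2D distances drift from the original ones. The five sample points in X are invented purely so the example has a known 3D answer; with real data you would start from your own distance matrix instead.

```matlab
% Hypothetical 3D points, used only to generate a distance matrix
X = [0 0 0; 1 0 0; 0 1 0; 0 0 1; 1 1 1] * 0.5;
d = pdist(X);                 % pairwise distances as a vector
D = squareform(d);            % same distances as an n-by-n matrix

[Y, e] = cmdscale(D);         % e holds the eigenvalues, largest first

% Keep at most two dimensions for plotting
k = min(2, size(Y, 2));
Y2 = Y(:, 1:k);

% Largest distortion introduced by dropping the remaining dimensions
maxErr = max(abs(d - pdist(Y2)));

plot(Y2(:, 1), Y2(:, 2), 'o');
```

A small maxErr relative to the largest distance suggests that the 2D picture is a fair summary; a large one means the plot should be read with caution.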

If you are interested in a more probabilistic representation, I have had much more success with network engines that use dynamic link physics/gravity to pull nodes together based on edge weights, but I have not seen anything that dynamic built into MATLAB.
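For a static approximation of that idea, MATLAB's graph plotting does offer a force-directed layout. The sketch below is an assumption-laden illustration rather than a tested recipe for 20,000 points: it reuses the small 3-point distance matrix from above, the node names are placeholders, the 'WeightEffect' option needs a reasonably recent release, and a complete graph on 20,000 nodes would have far too many edges to lay out this way without thinning them first.

```matlab
% Small symmetric distance matrix (zero entries mean "no edge")
D = [0.0 0.1 0.2;
     0.1 0.0 0.3;
     0.2 0.3 0.0];
names = {'1003', '1007', '1059'};   % placeholder node names

G = graph(D, names);                % undirected graph, edge weight = distance

h = plot(G);
% 'direct' asks for edge lengths that grow with the edge weight
layout(h, 'force', 'WeightEffect', 'direct');
```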

Upvotes: 1
