Jacky Lee

Reputation: 1233

How to generate this shape in Matlab?

In MATLAB, how can I generate two clusters of random points like those in the following graph? Can you show me the scripts/code?

Target shape

Upvotes: 3

Views: 2744

Answers (3)

Egon

Reputation: 4787

To generate such data points, you need to know the probability distribution they are drawn from.

For your plot, I do not have the real distributions, so I can only give an approximation. From your figure I see that both clusters lie approximately on a circle, with a random radius and a limited span for the angle. I assume those angles and radii are uniformly distributed over certain ranges, which seems like a pretty good starting point.

Therefore it also makes sense to generate the random data in polar coordinates (i.e. angle and radius) instead of Cartesian ones (i.e. horizontal and vertical), and then transform them to Cartesian coordinates for plotting.

% Centers of the two circles
C1 = [0 0];
C2 = [-5 7.5];
% Range of radii [min max] for each cluster
R1 = [8 10];
R2 = [8 10];
% Range of allowed angles [rad] for each cluster
A1 = [1 3]*pi/2;
A2 = [-1 1]*pi/2;

nPoints = 500;

% Draw nPoints uniform samples between limits(1) and limits(2)
urand = @(nPoints,limits)(limits(1) + rand(nPoints,1)*diff(limits));
% Draw random angles and radii, then convert from polar to Cartesian
randomCircle = @(n,r,a)(pol2cart(urand(n,a),urand(n,r)));

[P1x,P1y] = randomCircle(nPoints,R1,A1);
P1x = P1x + C1(1);   % shift to the first center
P1y = P1y + C1(2);

[P2x,P2y] = randomCircle(nPoints,R2,A2);
P2x = P2x + C2(1);   % shift to the second center
P2y = P2y + C2(2);

figure
plot(P1x,P1y,'or'); hold on;
plot(P2x,P2y,'sb');
axis square

This yields:

Outcome

This method works relatively well when the distributions are easy to transform and the possible locations of the points are easy to describe. If that is not the case, there are other methods, such as inverse transform sampling, which give you an algorithm for generating the data instead of the manual variable transformations I did here.
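A minimal sketch of inverse transform sampling, under an assumption that differs from the code above: here the points are meant to be uniform in area over the annular sector, so the radius is no longer uniform. The radius CDF and its inverse are worked out in the comments; the values of R and A are simply copied from R1 and A1.

```matlab
% Inverse transform sampling sketch: draw radii so that points are uniform
% in AREA over an annular sector (an assumption; the code above draws the
% radius uniformly instead).
rng(0);                      % fixed seed, only for reproducibility
n = 500;
R = [8 10];                  % inner and outer radius, as in R1
A = [1 3]*pi/2;              % angular range [rad], as in A1

u = rand(n,1);               % uniform samples on (0,1)
% CDF of the radius: F(r) = (r^2 - R(1)^2) / (R(2)^2 - R(1)^2).
% Solving F(r) = u for r gives the inverse CDF:
r = sqrt(R(1)^2 + u*(R(2)^2 - R(1)^2));
theta = A(1) + rand(n,1)*diff(A);   % the angle stays uniform

[x,y] = pol2cart(theta,r);
figure
plot(x,y,'or');
axis equal
```

The same recipe applies to any one-dimensional distribution whose CDF you can invert: push uniform samples through the inverse CDF.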

Upvotes: 6

user85109

Reputation:

I am assuming that you really want to cluster existing data, as opposed to generating the data itself. Since you have a plot of some data, it seems logical that you already know how to do that! If I am wrong in this assumption, then you should word your questions more carefully in the future.

The human brain is quite good at seeing patterns like this, whereas writing code to do the same on a computer often takes serious effort.

As has been said already, traditional clustering tools such as k-means will fail. Luckily, the Image Processing Toolbox already has good tools for this purpose. I suggest converting the plot into an image, using filled-in dots to plot the points. Make sure the dots are large enough that they touch each other within a cluster, with some overlap. Then use dilation/erosion tools if necessary to fill in any small cracks, but don't go so far as to cause the clusters to merge. Finally, use region segmentation tools to pick out the clusters. Once done, transform back from pixel units in the image into your spatial units, and you have accomplished your task.
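The steps above can be sketched roughly as follows. This is only an illustration: it needs the Image Processing Toolbox (strel, imdilate, bwlabel), the test data are made-up arcs similar to the figure, and the pixel size and dilation radius are guesses that would need tuning for real data.

```matlab
% Made-up test data: two interlocking arcs (assumed shape, not the OP's data)
rng(0);
n  = 500;
a1 = pi/2 + pi*rand(n,1);    r1 = 8 + 2*rand(n,1);
a2 = -pi/2 + pi*rand(n,1);   r2 = 8 + 2*rand(n,1);
X  = [r1.*cos(a1),     r1.*sin(a1); ...
      r2.*cos(a2) - 5, r2.*sin(a2) + 7.5];

% Rasterize: one foreground pixel per point
res = 0.1;                                 % spatial units per pixel (guess)
lo  = min(X,[],1) - 1;                     % pad the bounding box by 1 unit
ij  = floor(bsxfun(@minus,X,lo)/res) + 1;  % [column row] pixel of each point
BW  = false(max(ij(:,2)) + 10, max(ij(:,1)) + 10);
pix = sub2ind(size(BW), ij(:,2), ij(:,1));
BW(pix) = true;

% Grow the dots so neighbours within a cluster touch (radius is a guess)
BW = imdilate(BW, strel('disk',5));

% Label connected regions and read the label back at each point's pixel
[L,numRegions] = bwlabel(BW);
labels = L(pix);                           % cluster index for every row of X

fprintf('Found %d regions\n', numRegions);
```

If the dilation radius is too small a cluster breaks into several regions, and if it is too large the two clusters merge, which is the separation requirement in practice.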

For the image processing approach to work, you will need sufficient separation between the clusters compared to the coarseness within a cluster. But that seems obvious for any method to succeed.

Upvotes: 2

nsanders

Reputation: 12646

K-means is not going to give you what you want.

For K-means, vectors are classified based on their nearest cluster center. I can only think of two ways you could get the non-convex assignment shown in the picture:

  • Your input data is actually higher-dimensional, and your sample image is just a 2-d projection.
  • You're using a distance metric with different scaling across the dimensions.
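As a small illustration of the assignment rule, here is a sketch with made-up points and centers (X, C and the weights W are all hypothetical). It assigns each point to its nearest center, once with plain squared Euclidean distance and once with the first dimension weighted ten times more heavily. The weights move the boundary, but between any two centers it is still a straight line.

```matlab
X = [0 0; 2 4; 4 2; 5 5];    % hypothetical points, one per row
C = [0 0; 5 5];              % hypothetical cluster centers
W = [1 1; 10 1];             % each row: per-dimension weights for one run

idx = zeros(size(X,1), size(W,1));
for j = 1:size(W,1)
    D = zeros(size(X,1), size(C,1));
    for k = 1:size(C,1)
        d = bsxfun(@minus, X, C(k,:));   % offsets from center k
        D(:,k) = (d.^2) * W(j,:)';       % weighted squared distance
    end
    [~, idx(:,j)] = min(D, [], 2);       % nearest center for each point
end
disp(idx)   % column 1: Euclidean; column 2: weighted metric
```

With these numbers the point (2,4) goes to the second center under the Euclidean metric and to the first center under the weighted one, while the other three points keep their assignment.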

To achieve your aim:

  • Use a non-linear clustering algorithm.
  • Apply a non-linear transform to your input data. (Probably not feasible).

You can find a list of non-linear clustering algorithms here. Specifically, look at this reference on the MST clustering page. Your exact shape appears on the fourth page of the PDF, together with a comparison of what happens with K-Means.
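If you have the Statistics and Machine Learning Toolbox, one way to try the MST idea is single-linkage hierarchical clustering: cutting a single-linkage tree into k clusters is equivalent to removing the k-1 longest edges of the minimum spanning tree. The sketch below is not the method from the linked reference, and the test data are made-up arcs similar to your figure.

```matlab
% Made-up test data: two interlocking arcs (assumed shape, not the OP's data)
rng(0);
n  = 500;
a1 = pi/2 + pi*rand(n,1);    r1 = 8 + 2*rand(n,1);
a2 = -pi/2 + pi*rand(n,1);   r2 = 8 + 2*rand(n,1);
X  = [r1.*cos(a1),     r1.*sin(a1); ...
      r2.*cos(a2) - 5, r2.*sin(a2) + 7.5];

Z   = linkage(X, 'single');          % single linkage merges along MST edges
idx = cluster(Z, 'maxclust', 2);     % cut the tree into two clusters

figure
plot(X(idx==1,1), X(idx==1,2), 'or'); hold on
plot(X(idx==2,1), X(idx==2,2), 'sb');
axis equal
```

Single linkage is sensitive to noise: a thin chain of stray points between the two arcs is enough to join them, so this only works when the gap between clusters is clearly larger than the spacing within a cluster.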

For existing MATLAB code, you could try this Kernel K-Means implementation. Also, check out the Clustering Toolbox.

Upvotes: 2
