user2261062
Plot symbols depending on vector values

I have a dataset of points represented by a 2D vector (X).

Each point belongs to one of four classes, given by an integer label in Y (from 1 to 4).

I want to plot each point with a different symbol depending on its class.

Toy example:

X = randi(100,10,2);   % 10 points ranging 1:100 in 2D space
Y = randi(4,10,1);     % class of the points (1 to 4)

I create a vector of symbols for each class:

S = {'bx' 'rx' 'b.' 'r.'};

Then I try:

plot(X(:,1), X(:,2), S(Y))


Error using plot
Invalid first data argument

How can I assign to each point of X a different symbol based on the value of Y?

Of course I can loop over the classes and plot them one by one, but is there a way to plot each class with a different symbol directly?

Upvotes: 3

Views: 131

Answers (2)

EBH

Reputation: 10450

No need for a loop, use gscatter:

X = randi(100,10,2);   % 10 points ranging 1:100 in 2D space
Y = randi(4,10,1);     % class of the points (1 to 4)
color = 'brbr';
symbol = 'xx..';
gscatter(X(:,1),X(:,2),Y,color,symbol)

and you will get a grouped scatter plot, with one color and symbol per class.
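As a rough sketch of how the same call can be extended: gscatter (part of the Statistics and Machine Learning Toolbox) also accepts a marker size, a legend switch and axis labels, and returns one line handle per group. The data values, the axis label strings and the marker sizes below are made up for illustration.

```matlab
% Fixed toy data with all four classes present (values are made up).
X = [10 80; 25 40; 60 15; 75 90; 40 55; 90 30; 15 20; 55 70];
Y = [1; 2; 3; 4; 1; 2; 3; 4];

% One color and one marker per class, in class order 1..4.
color  = 'brbr';
symbol = 'xx..';

% Extra arguments: marker size 10, legend on, and axis labels (assumed names).
h = gscatter(X(:,1), X(:,2), Y, color, symbol, 10, 'on', 'x_1', 'x_2');

% h holds one line object per group, so a class can be restyled afterwards.
set(h(3), 'MarkerSize', 18);   % e.g. make the dots of class 3 easier to see
```

Keeping the handles is optional; it is only useful if individual classes need to be restyled after plotting.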

Upvotes: 3

Stewie Griffin

Reputation: 14939

If X has many rows but there are only a few symbol types in S, I suggest you check out the second approach first. It is optimized for speed rather than readability: in my timings it is about twice as fast for 10 points and more than 200 times as fast for 1000 points.


First approach (easy to read):

Regardless of approach, I think you need a loop for this:

hold on
arrayfun(@(n) plot(X(n,1), X(n,2), S{Y(n)}), 1:size(X,1))

Or, to write the loop in the "conventional way":

hold on
for n = 1:size(X,1)
   plot(X(n,1), X(n,2), S{Y(n)})
end


Second approach (gives same plot as above):

If your dataset is large, you can sort Y with [Y_sorted, Y_index] = sort(Y), then use Y_index to reorder the rows of X, like this: X_sorted = X(Y_index, :);. After this, you split X_sorted into 4 groups, one for each of the individual Y-values, using histc and mat2cell. Then you loop over the four groups and plot each one individually.

This way you only need to loop through four values, regardless of the number of elements in your data. This should be a lot faster if the number of elements is high.

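[Y_sorted, Y_index] = sort(Y);                     % Y_index puts equal classes next to each other
X_sorted = X(Y_index, :);                          % reorder whole rows of X
X_cell = mat2cell(X_sorted, histc(Y,1:numel(S)));  % one cell per class, sized by the class counts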

hold on
for ii = 1:numel(X_cell)
    plot(X_cell{ii}(:,1),X_cell{ii}(:,2),S{ii})
end
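To make the sort-and-split step concrete, here is a small sketch on a fixed input. The data values and the nClasses variable are made up for illustration, and accumarray is swapped in for histc to count the points per class (MathWorks documents histc as not recommended); for integer labels from 1 to nClasses the counts should be the same.

```matlab
% Tiny fixed input so the grouping is easy to follow (values are made up).
X = [10 11; 20 21; 30 31; 40 41; 50 51];
Y = [3; 1; 4; 1; 4];
nClasses = 4;                               % assumed number of classes

[~, order] = sort(Y);                       % order = [2; 4; 1; 3; 5]
X_sorted = X(order, :);                     % reorder whole rows, not single elements
counts = accumarray(Y, 1, [nClasses 1]);    % counts = [2; 0; 1; 2]
X_cell = mat2cell(X_sorted, counts);        % nClasses-by-1 cell, one block of rows per class

disp(X_cell{1})                             % rows of class 1: [20 21; 40 41]
disp(size(X_cell{2}))                       % class 2 has no points: 0-by-2 block
```

A class without any points ends up as an empty 0-by-2 cell, so the position in X_cell still matches the class number and S{ii} picks the right symbol in the plotting loop.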

Benchmarking:

I ran a very simple benchmark of the two approaches using timeit. The results show that the second approach is a lot faster:

For 10 elements:

First approach: 0.0086 s
Second approach: 0.0037 s

For 1000 elements:

First approach: 0.8409 s
Second approach: 0.0039 s

Upvotes: 2
