Hashim

Reputation: 327

Plot K-Means clustering

I would like to plot a K-Means clustering result with two clusters in total, each in a different color. A sample of the data is shown below.

x
            Name Cluster
1           A2M       1
2          AAAS       1
3          AACS       1
4         AAGAB       1
5          AAK1       1
6          AAMP       1
7          AARS       1
8         AARS2       1
9        AARSD1       1
10        AASDH       1
11     AASDHPPT       1
12         AASS       1
13         AATF       1
14         ABAT       1
15        ABCA1       1
16      ABCA11P       1
17        ABCA3       1
18        ABCA5       1
19       ABCB10       1
20        ABCB6       1
21        ABCB7       1
22        ABCB8       1
23        ABCC1       1
24       ABCC10       1
25        ABCC4       1
26        ABCC5       1
27        ABCD3       1
28        ABCD4       1
29        ABCE1       1
30        ABCF1       1
31        ABCF2       1
32        ABCF3       1
33        ABCG1       1
34       ABHD10       1
35       ABHD11       1
36       ABHD12       1
37       ABHD13       1
38      ABHD14A       1
39      ABHD14B       1
40        ABHD2       1
20286    ZNF749       2
20287     ZNF76       2
20288   ZNF804A       2
20289   ZNF804B       2
20290    ZNF835       2
20291    ZNF852       2
20292   ZNF861P       2
20293    ZNF865       2
20294   ZNF876P       2
20295     ZNF99       2
20296     ZNRF4       2
20297       ZP1       2
20298       ZP2       2
20299     ZPBP2       2
20300    ZSCAN1       2
20301   ZSCAN10       2
20302 ZSCAN12P1       2
20303    ZSWIM2       2
20304    ZSWIM4       2
20305      tAKR       2

The data frame x has two clusters, of size 15206 and 5099. I tried the following code:

library(ggplot2)
ggplot(x, aes(x$Name, x$Cluster, color = x$Cluster)) + geom_point()

Got the error:

Error in UseMethod("depth") : no applicable method for 'depth' applied to an object of class "NULL"

Upvotes: 0

Views: 408

Answers (1)

Roman Luštrik

Reputation: 70643

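Please revise your code. Inside aes() there is no need to prefix variables with the name of your object (x$Name, x$Cluster); ggplot looks up the bare column names in the data frame you pass as the first argument. Wrapping Cluster in as.factor() makes the color scale discrete, so each cluster gets its own color.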

xy <- data.frame(Name = LETTERS, Cluster = sample(1:3, size = 26, replace = TRUE))

library(ggplot2)

ggplot(xy, aes(x = Name, y = Cluster, color = as.factor(Cluster))) +
  geom_point()


In case you see some overlap, you can always jitter the result.

ggplot(xy, aes(x = Name, y = Cluster, color = as.factor(Cluster))) +
  geom_jitter(height = 0.05)
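To apply the same idea to the data frame from the question, one option is to convert the Cluster column to a factor once, up front, so it can be used as a bare name in aes(). This is only a minimal sketch: the six-row x below is a hypothetical stand-in built from a few of the gene names in the sample, not the real 20305-row data, and the legend title is an illustrative choice.

```r
library(ggplot2)

# Hypothetical stand-in for the asker's data frame `x` (a few rows only).
x <- data.frame(
  Name = c("A2M", "AAAS", "AACS", "ZP1", "ZP2", "ZPBP2"),
  Cluster = c(1, 1, 1, 2, 2, 2)
)

# Convert once so ggplot2 treats the cluster as a discrete variable.
x$Cluster <- factor(x$Cluster)

# Bare column names inside aes(); no x$ prefix needed.
p <- ggplot(x, aes(x = Name, y = Cluster, color = Cluster)) +
  geom_point() +
  labs(color = "Cluster")

p
```

Converting the column up front keeps the aes() call short and avoids a legend titled as.factor(Cluster).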


Upvotes: 2
