Krellex

Reputation: 753

Plotting data from a csv file

I've done a manual K-means calculation with 3 iterations. I want to plot the cluster centroids together with the points assigned to each cluster after the third iteration. Each point has an associated "E" value of either 1 or 2. Points of type E1 should be plotted with a * and points of type E2 with a +. I am not sure how to go about this.

Full data: Data

Third iteration Centroids:

Centroid1(2, 3.5)

Centroid2(6.2, 8.8)

Centroid3(8.8, 2.4)

Clustered points after 3 iterations:

Cluster1: (1,1), (1,6), (2,1), (4,6)

Cluster2: (3,9), (3,10), (5,6), (8,9), (9,9), (9,10)

Cluster3: (7,2), (8,1), (9,1), (10,3), (10,5)
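As a sanity check on the numbers above, each centroid should be the coordinate-wise mean of the points in its cluster. A minimal sketch in base R, assuming the points are typed in by hand from the lists above (the names `clusters` and `centroids` are only illustrative and do not come from the csv file):

```r
# Points per cluster after the third iteration, copied from the lists above
clusters <- data.frame(
  X = c(1, 1, 2, 4,   3, 3, 5, 8, 9, 9,   7, 8, 9, 10, 10),
  Y = c(1, 6, 1, 6,   9, 10, 6, 9, 9, 10,  2, 1, 1, 3, 5),
  cluster = rep(1:3, times = c(4, 6, 5))
)

# A centroid is the coordinate-wise mean of the points in its cluster
centroids <- aggregate(cbind(X, Y) ~ cluster, data = clusters, FUN = mean)

# Rounded to one decimal place for comparison with the values listed above
print(round(centroids, 1))
```

Rounded to one decimal place, the means agree with the three centroids listed above.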

So far I have managed to load the csv file and remove the Sample column:

data <- read.csv("data.csv")

data2 <- data[, -c(1)]

Upvotes: 0

Views: 67

Answers (1)

Allan Cameron

Reputation: 174506

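You could merge the E1 and E2 columns into a single E column and map it to the point shape (shape 8 is an asterisk, shape 3 is a plus):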

data2$E <- ifelse(is.na(data2$E1), data2$E2, data2$E1)

library(ggplot2)

ggplot(data2, aes(X, Y, shape = factor(E))) +
  geom_point(size = 4) +
  scale_shape_manual(values = c(8, 3), name = "E") +
  theme_bw()
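The plot above shows only the points. If the centroids and the cluster membership should appear as well, one possible extension is sketched below. It is not part of the original answer: the data frames `members`, `centroids`, `points` and `plot_data` are hypothetical names, the coordinates are typed in from the question, and the E values are placeholders because the real ones live in the csv file. With the real data, `data2` (with the combined E column) would take the place of `points`.

```r
library(ggplot2)

# Cluster membership and centroids after the third iteration (from the question)
members <- data.frame(
  X = c(1, 1, 2, 4, 3, 3, 5, 8, 9, 9, 7, 8, 9, 10, 10),
  Y = c(1, 6, 1, 6, 9, 10, 6, 9, 9, 10, 2, 1, 1, 3, 5),
  cluster = factor(rep(1:3, times = c(4, 6, 5)))
)
centroids <- data.frame(
  X = c(2, 6.2, 8.8),
  Y = c(3.5, 8.8, 2.4),
  cluster = factor(1:3)
)

# ASSUMPTION: stand-in for data2 from the csv file. The E values here are
# placeholders, not the real ones; with the real data, use data2 instead.
points <- members[, c("X", "Y")]
points$E <- rep(c(1, 2), length.out = nrow(points))

# Attach the cluster label to each point by matching on its coordinates
plot_data <- merge(points, members, by = c("X", "Y"))

p <- ggplot(plot_data, aes(X, Y)) +
  # Shape encodes E (8 = asterisk, 3 = plus), colour encodes the cluster
  geom_point(aes(shape = factor(E), colour = cluster), size = 4) +
  # Centroids drawn as larger filled diamonds in the colour of their cluster
  geom_point(data = centroids, aes(colour = cluster), shape = 18, size = 6) +
  scale_shape_manual(values = c(8, 3), name = "E") +
  theme_bw()
print(p)
```

Matching on coordinates works here because every (X, Y) pair in the question is unique; if the data contained duplicate points, a join on the Sample column would be the safer choice.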


Upvotes: 1
