stats_noob

Reputation: 5925

R: Superimpose Clusters on top of a Graph

I am using the R programming language. I created some data and made a KNN graph of it. Then I performed clustering on this graph. Now I want to superimpose the clusters on top of the graph.

Here is an example I made up (adapted from https://michael.hahsler.net/SMU/EMIS8331/material/jpclust.html). Suppose we have a dataset with 3 variables: the longitude, the latitude and the price of a house. All three are scaled, since price and longitude/latitude are in different units. We can then make a KNN graph in R:

library(dbscan)

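# Plot the points, then draw a line from each point to each of its nearest neighbours.
# plot() and lines() only use the first two columns of x, even if x has more.
plot_nn <- function(x, nn, ...) {
  plot(x, ...)
  for (i in seq_len(nrow(nn$id)))
    for (j in seq_len(ncol(nn$id)))
      lines(x = c(x[i, 1], x[nn$id[i, j], 1]), y = c(x[i, 2], x[nn$id[i, j], 2]))
}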

Lat = round(runif(500,43,44), 4)
Long = round(runif(500,79,80), 4)
price = rnorm(500,1000000,200)
b = data.frame(Lat, Long, price)
b = scale(b)

b = as.matrix(b)
nn <- kNN(b, k = 10, sort = FALSE)
plot_nn(b, nn, col = "grey")
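
As a side note, the nested loop in plot_nn calls lines() once per edge. A minimal sketch of the same drawing with a single segments() call is below. It assumes nn$id is an n-by-k integer matrix of neighbour indices; the name plot_nn_fast, the edge_col argument and the toy data are made up for illustration, and the neighbour matrix is built with base R instead of kNN() so the snippet runs on its own.

```r
# Hypothetical helper: draws every neighbour edge with one segments() call.
plot_nn_fast <- function(x, nn, edge_col = "grey", ...) {
  id   <- nn$id
  from <- rep(seq_len(nrow(id)), times = ncol(id))  # matches column-major as.vector(id)
  to   <- as.vector(id)
  plot(x[, 1:2], type = "n")                        # empty frame with the right limits
  segments(x[from, 1], x[from, 2], x[to, 1], x[to, 2], col = edge_col)
  points(x[, 1:2], ...)                             # points drawn on top of the edges
  invisible(cbind(from = from, to = to))            # edge list, returned for inspection
}

# Toy data (assumption, not the data from the question)
set.seed(1)
x_toy <- scale(cbind(Lat = runif(50, 43, 44), Long = runif(50, 79, 80)))

# Base R stand-in for kNN(x_toy, k = 3)$id
d <- as.matrix(dist(x_toy))
diag(d) <- Inf                                      # a point is not its own neighbour
nn_toy <- list(id = t(apply(d, 1, function(r) order(r)[1:3])))

edges <- plot_nn_fast(x_toy, nn_toy)
```

Because the edges are drawn before the points, anything passed through ... (for example col = cl) ends up on top of the grey lines.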

Now, I perform the clustering algorithm:

x = b

JP_R <- function(x, k, kt) {
  # Step 1
  nn <- kNN(x, k, sort = FALSE)$id
  n <- nrow(nn)

  # Step 2
  labels <- 1:n

  # Step 3
  for(i in 1:n) {
    # check all neighbors of i
    for(j in nn[i,]) {
      if(j<i) next ### we already checked this edge
      if(labels[i] == labels[j]) next ### already in the same cluster
      if(i %in% nn[j,] && length(intersect(nn[i,], nn[j,]))+1L >= kt) {
        labels[labels == max(labels[i], labels[j])] <- min(labels[i], labels[j])
      }
    }
  }

  # Step 4: create contiguous labels
  as.integer(factor(labels))
}


cl <- JP_R(x, k = 10, kt = 6)

I can make a basic plot of the clustering result:

plot(x, col = cl)

But is there a way to show these clusters on the first image instead, so that the points of the KNN graph are coloured by cluster?

Thanks

Upvotes: 1

Views: 154

Answers (1)

ThomasIsCoding

Reputation: 102469

After drawing the graph with plot_nn(b, nn, col = "grey"), you can add the cluster colours on top with either

points(x, col = cl)

or

par(new = TRUE)
plot(x, col = cl)
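
For completeness, here is a minimal self-contained sketch of the whole overlay: grey edges first, then the points coloured by cluster. The toy data, the base R neighbour search and the cutree() labels are stand-ins (assumptions) for b, kNN() and JP_R() from the question, so it runs without extra packages. It also builds an explicit colour vector, because col = cl recycles the default palette once the number of clusters exceeds the palette length.

```r
set.seed(42)

# Toy stand-ins for the objects in the question (assumptions, not the original data)
x <- scale(cbind(Lat = runif(100, 43, 44), Long = runif(100, 79, 80)))
d <- as.matrix(dist(x))
diag(d) <- Inf
nn_id <- t(apply(d, 1, function(r) order(r)[1:5]))  # base R substitute for kNN(x, k = 5)$id
cl <- cutree(hclust(dist(x)), k = 4)                # placeholder labels, NOT Jarvis-Patrick

# One distinct colour per cluster, however many clusters there are
cols <- rainbow(max(cl))[cl]

# 1. empty frame, 2. grey KNN edges, 3. cluster-coloured points on top
plot(x, type = "n")
for (i in seq_len(nrow(nn_id)))
  for (j in nn_id[i, ])
    lines(x[c(i, j), 1], x[c(i, j), 2], col = "grey")
points(x, col = cols, pch = 19)
```

With the objects from the question, the last step is simply points(x, col = cl) after plot_nn(b, nn, col = "grey"). The par(new = TRUE) variant redraws the axes, so it only lines up when both plots use the same data ranges.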

Upvotes: 1
