SriniShine

Reputation: 1139

R: How to add the noise cluster into DBSCAN plot

I'm trying to plot DBSCAN results. This is what I have done so far. My distance matrix is here.

dbs55_CR_EUCL = dbscan(writeCRToMatrix,eps=0.006, MinPts = 4, method = "dist")

plot(writeCRToMatrix[dbs55_CR_EUCL$cluster>0,], 
     col=dbs55_CR_EUCL$cluster[dbs55_CR_EUCL$cluster>0],
     main="DBSCAN Clustering K = 4 \n (EPS=0.006, MinPts=4) without noise",
     pch = 20)

This is the plot: [image]

When I tried plotting all the clusters, including the noise cluster, I could only see 2 points in my plot. [image]

What I'm looking for:

  1. Add the points in the noise cluster to the plot, but with a different symbol, similar to the following picture:

[image]

  2. Shade the cluster areas, as in the following picture:

[image]

Upvotes: 0

Views: 1385

Answers (1)

Michael Hahsler

Reputation: 3075

Noise points are assigned cluster id 0. R plots usually ignore a color of 0, so to show the noise points (in black, the first palette color) shift every id up by one:

plot(writeCRToMatrix, 
  col=dbs55_CR_EUCL$cluster+1L,
  main="DBSCAN Clustering K = 4 \n (EPS=0.006, MinPts=4) with noise",
  pch = 20)

If you want a different symbol for noise then you could do the following (adapted from the man page):

library(dbscan)
n <- 100
x <- cbind(
     x = runif(10, 0, 10) + rnorm(n, sd = 0.2),
     y = runif(10, 0, 10) + rnorm(n, sd = 0.2)
)

res <- dbscan::dbscan(x, eps = .2, minPts = 4)
plot(x, col=res$cluster, pch = 20)
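# Select whole rows (note the comma) so both coordinates of each noise point are kept
points(x[res$cluster == 0L, , drop = FALSE], col = "grey", pch = "+")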
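
As an alternative sketch, the colors and symbols can also be passed as vectors to a single plot() call, which avoids the separate points() step. The helper name cluster_style and the toy labels below are made up for illustration; in practice the label vector would be res$cluster.

```r
# Hypothetical helper: map cluster ids to plotting colors and symbols.
# Noise (id 0) becomes a grey "+"; clusters 1, 2, ... keep their palette colors.
cluster_style <- function(cluster) {
  pal <- palette()
  col <- ifelse(cluster == 0L, "grey",
                pal[(cluster - 1L) %% length(pal) + 1L])
  pch <- ifelse(cluster == 0L, 3L, 20L)  # pch 3 is a plus sign, 20 a small dot
  list(col = col, pch = pch)
}

# Toy data: six points, the last two labelled as noise (assumed labels).
pts <- cbind(x = c(1, 1.1, 5, 5.1, 3, 9), y = c(1, 1.2, 5, 5.1, 8, 2))
cl  <- c(1L, 1L, 2L, 2L, 0L, 0L)

style <- cluster_style(cl)
plot(pts, col = style$col, pch = style$pch)
```

Because col and pch are recycled per point, the noise points are drawn in the same call and the axis limits automatically include them.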

Here is code that will create a shaded convex hull for each cluster:

library(ggplot2)
library(data.table)
library(dbscan)


dt <- data.table(x, level=as.factor(res$cluster), key = "level")
hulls <- dt[, .SD[chull(x, y)], by = level]

### get rid of hull for noise
hulls <- hulls[level != "0",]

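### one color per cluster actually found (the random data usually gives more than 2);
### noise ("0") stays grey
ids <- setdiff(levels(dt$level), "0")
cols <- c("0" = "grey", setNames(rainbow(length(ids)), ids))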

ggplot(dt, aes(x=x, y=y, color=level)) +
  geom_point() +
  geom_polygon(data = hulls, aes(fill = level, group = level),
    alpha = 0.2, color = NA) +
  scale_color_manual(values = cols) +
  scale_fill_manual(values = cols)
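
If you would rather stay with base graphics, a rough equivalent can be sketched with chull() and polygon(). The helper name cluster_hulls and the toy labels are assumptions for illustration only; the label vector would normally come from the clustering result.

```r
# Hypothetical helper: convex hull vertices for every non-noise cluster.
cluster_hulls <- function(pts, cluster) {
  ids <- setdiff(sort(unique(cluster)), 0L)   # drop the noise id
  hulls <- lapply(ids, function(k) {
    member <- pts[cluster == k, , drop = FALSE]
    member[chull(member), , drop = FALSE]     # hull vertices in drawing order
  })
  names(hulls) <- ids
  hulls
}

# Toy data with assumed labels (0 = noise), standing in for DBSCAN output.
set.seed(1)
pts <- rbind(
  cbind(rnorm(20, 2, 0.3), rnorm(20, 2, 0.3)),
  cbind(rnorm(20, 6, 0.3), rnorm(20, 6, 0.3)),
  cbind(runif(5, 0, 8), runif(5, 0, 8))
)
cl <- c(rep(1L, 20), rep(2L, 20), rep(0L, 5))

# Points: noise as grey "+", cluster members as dots in palette colors.
col <- palette()[cl + 1L]
col[cl == 0L] <- "grey"
plot(pts, col = col, pch = ifelse(cl == 0L, 3, 20), xlab = "x", ylab = "y")

# Shade each hull with a semi-transparent version of the cluster color.
hulls <- cluster_hulls(pts, cl)
for (k in names(hulls)) {
  fill <- adjustcolor(palette()[as.integer(k) + 1L], alpha.f = 0.2)
  polygon(hulls[[k]], col = fill, border = NA)
}
```

This indexes the default palette directly, so it only gives distinct colors for up to seven clusters; beyond that a longer color vector would be needed.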

Hope this helps.

Upvotes: 2
