Clustering points based on their linear proximity

Question

I have data that I want to cluster into two groups based on linear proximity (i.e., points that are almost collinear should be grouped together). Here is a sample of my data:

data <- data.frame(Y=c(seq(0,10,1), seq(0,4,0.5)), X= c(0:10,0:8))
plot(jitter(data$Y), jitter(data$X), pch=19)

The result I want is for the points along each line to form their own cluster.

Obviously, regular clustering (hierarchical or k-means) didn't work. I also tried spectral clustering, but it didn't give a good result either.

Any suggestion on how to do this (using clustering, regression or other methods) is highly appreciated! Thanks

Rorschach · Accepted Answer

You could try the package mclust, which fits Gaussian mixture models:

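library(mclust)

## Add a little Gaussian noise (sd = 0.1) to the Y values of both lines
data <- data.frame(Y = c(seq(0, 10, 1), seq(0, 4, 0.5)) + rnorm(20, 0, 0.1),
                   X = c(0:10, 0:8))

## Fit a Gaussian mixture model; the number of components is chosen by BIC
fit <- Mclust(data)

## Draw the classification plot directly instead of going through the menu
plot(fit, what = "classification")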
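
If you need the group labels themselves rather than just the plot, something along these lines might work. It is only a sketch: the `G = 2` argument, the fixed seed, and the names `fit2` and `cluster` are my assumptions, not part of the code above, and there is no guarantee the mixture splits the points exactly along the two lines (the points near the origin overlap).

```r
library(mclust)

set.seed(1)  # assumption: fixed seed so the added noise is reproducible
data <- data.frame(Y = c(seq(0, 10, 1), seq(0, 4, 0.5)) + rnorm(20, 0, 0.1),
                   X = c(0:10, 0:8))

# Assumption: force exactly two mixture components with G = 2,
# instead of letting BIC choose the number of components
fit2 <- Mclust(data, G = 2)

# One integer label (1 or 2) per row of the input
data$cluster <- fit2$classification
table(data$cluster)

# Colour the points by their assigned component
plot(data$X, data$Y, col = data$cluster, pch = 19)
```

Dropping `G = 2` gives back the default behaviour, where the number of components is picked by BIC.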

Answers (2)

Transform your data.
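
The answer does not say which transformation is meant, so the following is only one possible reading. It assumes every line passes through the origin, as both lines in the sample data do, so the direction `atan2(Y, X)` is constant along a line and ordinary k-means on that single value can separate them. The names `angle`, `km`, and `cluster` are made up for illustration.

```r
# Sample data from the question (no jitter)
data <- data.frame(Y = c(seq(0, 10, 1), seq(0, 4, 0.5)), X = c(0:10, 0:8))

# Assumption: all lines pass through the origin, so each point's direction
# from the origin is the same for every point on a given line
data$angle <- atan2(data$Y, data$X)

# k-means on the one-dimensional angle; several random starts are used so a
# poor initial choice of centers is unlikely to decide the result
set.seed(42)
km <- kmeans(data$angle, centers = 2, nstart = 25)
data$cluster <- km$cluster

# The point (0, 0) lies on both lines, so its label carries no information
plot(data$X, data$Y, col = data$cluster, pch = 19)
```

This would not work for lines with different intercepts; in that case a different transformation, or a model-based approach such as the mclust answer, would be needed.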
