Filly

Reputation: 733

Clustering points based on their linear proximity

I have data that I want to cluster into two groups based on linear proximity (i.e., points that are almost collinear should be grouped together). Here is a sample of my data:

data <- data.frame(Y=c(seq(0,10,1), seq(0,4,0.5)), X= c(0:10,0:8))
plot(jitter(data$Y), jitter(data$X), pch=19) 

The result that I want to get is something like this: [image]

Obviously, regular (hierarchical or k-means) clustering didn't work. I also tried spectral clustering, which didn't give a good result either.

Any suggestions on how to do this (using clustering, regression, or other methods) would be highly appreciated. Thanks!

Upvotes: 2

Views: 401

Answers (2)

Has QUIT--Anony-Mousse

Reputation: 77454

Transform your data.

Try clustering or analyzing the ratio z = x / y instead. Points on the same line through the origin share (almost) the same ratio.
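A minimal sketch of that idea, using the noiseless sample data from the question and base R's kmeans. The column name cluster and the helper variables z and ok are only illustrative, and dropping the origin points is an assumption on my part: both lines pass through (0, 0), so the ratio is undefined there and those points cannot be assigned to either line.

```r
# Sample data from the question (no jitter, so the ratios are exact).
data <- data.frame(Y = c(seq(0, 10, 1), seq(0, 4, 0.5)), X = c(0:10, 0:8))

# One-dimensional feature: the ratio x / y.
# It is constant along any line through the origin.
z <- data$X / data$Y

# At the origin 0 / 0 gives NaN; leave those points unassigned.
ok <- is.finite(z)

# Two-group k-means on the ratio alone.
data$cluster <- NA_integer_
data$cluster[ok] <- kmeans(z[ok], centers = 2, nstart = 10)$cluster

# Cross-tabulate ratio against cluster label.
print(table(ratio = z[ok], cluster = data$cluster[ok]))
```

With noisy data the ratio becomes unstable for points close to the origin, so it may help to exclude points within some small radius of (0, 0) before clustering, or to cluster on the angle atan2(x, y) instead of the raw ratio.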

Upvotes: 0

Rorschach

Reputation: 32426

You could try the mclust package, which fits Gaussian mixture models:

## Add a little noise to the lines
data <- data.frame(Y=c(seq(0,10,1), seq(0,4,0.5))+rnorm(20,0,0.1), X= c(0:10,0:8))

library(mclust)
fit <- Mclust(data)
plot(fit, what = "classification")  # show only the classification plot

Upvotes: 3
