Evgenij Reznik

Reputation: 18614

Hierarchical clustering with R

Consider several points:

A = (1, 2.5), B = (5, 10), C = (23, 34), D = (45, 47), E = (4, 17), F = (18, 4)

How can I perform hierarchical clustering on them with R?
I've read this example, Cluster Analysis, but I'm not sure how to enter these values as points rather than as plain numbers.

When I do

x <- c(...) #x values
y <- c(...) #y values

I can plot them using

plot(x,y)

But how can I specify those values like in the example:

mydata <- scale(mydata)

Doing

mydata <- scale(x,y)

I get the following error

Error in scale.default(x, y) : 
  length of 'center' must equal the number of columns of 'x'

Upvotes: 1

Views: 1239

Answers (1)

jlhoward

Reputation: 59415

Something like this?

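# Each point is a length-2 vector: (x, y)
A <- c(1, 2.5); B <- c(5, 10); C <- c(23, 34)
D <- c(45, 47); E <- c(4, 17); F <- c(18, 4)
# Stack the points as rows; the row names A..F are kept as labels
df <- data.frame(rbind(A, B, C, D, E, F))
colnames(df) <- c("x", "y")
# Distance matrix first, then hierarchical clustering on it
hc <- hclust(dist(df))
plot(hc)    # draws the dendrogram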

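This puts the points into a data frame with one row per point and two columns, x and y. dist() then calculates the distance matrix (the pairwise distance between every pair of points, Euclidean by default), and hclust() runs the hierarchical cluster analysis on that matrix (complete linkage by default). plot(hc) draws the resulting dendrogram.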
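To see what those steps produce, here is a minimal sketch that rebuilds the same six points and prints the intermediate results. It writes out the defaults of dist() and hclust() explicitly; the names pts and d are only illustrative and not part of the code above.

```r
# Rebuild the six points from the question as a 6 x 2 matrix.
pts <- rbind(A = c(1, 2.5), B = c(5, 10), C = c(23, 34),
             D = c(45, 47), E = c(4, 17), F = c(18, 4))
colnames(pts) <- c("x", "y")

# dist() defaults to Euclidean distance; spelled out here for clarity.
d <- dist(pts, method = "euclidean")
print(round(as.matrix(d), 2))

# hclust() defaults to complete linkage; also spelled out.
hc <- hclust(d, method = "complete")

# Each row of hc$merge is one agglomeration step: negative numbers are
# single points (by row index), positive numbers refer to earlier steps.
print(hc$merge)

# Distance at which each merge happened (the dendrogram heights).
print(round(hc$height, 2))
```

The first merge joins B and E, which are the closest pair of points. Passing a different method to hclust() (for example "average" or "single") may change the later merges.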

We can then cut the tree into a chosen number of clusters and plot the points colored by cluster.

df$cluster <- cutree(hc,k=2)    # identify 2 clusters
plot(y~x,df,col=cluster)
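As for the error in the question: scale(x, y) passes y as the center argument of scale(), so R complains that its length does not match the single column in x. If you start from separate x and y vectors and want to standardize as in the linked example, one option is to bind them into a two-column matrix first. This is a sketch under that assumption; the names pts, pts_scaled and hc_scaled are illustrative.

```r
# Coordinates from the question, in the order A, B, C, D, E, F.
x <- c(1, 5, 23, 45, 4, 18)
y <- c(2.5, 10, 34, 47, 17, 4)

# Bind the vectors into a 6 x 2 matrix: one row per point.
pts <- cbind(x, y)
rownames(pts) <- c("A", "B", "C", "D", "E", "F")

# scale() now standardizes each column (mean 0, standard deviation 1).
pts_scaled <- scale(pts)

# Cluster the standardized points exactly as before.
hc_scaled <- hclust(dist(pts_scaled))
plot(hc_scaled)

# Cluster membership for two clusters, named by point.
clusters <- cutree(hc_scaled, k = 2)
print(clusters)
```

Whether to scale depends on the data: when both coordinates are in the same units, clustering the raw values may be just as reasonable.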


Upvotes: 3
