Create dendrogram using an index (proportion data) as grouping variable, R

Question

I want to create a dendrogram from an index (proportion data) so that observations with similar index values end up in the same cluster. I am trying to decide which distance/similarity metric to use so that the distances reflect the original index values.

I have a data frame that looks like this:

data <- read.table(text="ind  index
T1   0.10
T2   0.11
T3   0.01
T4   0.64
T5   0.03
T6   0.15
T7   0.26
T8   0.06
T9   0.01
T10  0.004
T11  0.01
T12  0.19
T13  0.04
T14  0.69
T15  0.06
T16  0.51
T17  0.15
T18  0.26
T19  0.26
T20  0.01
", header=TRUE)

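head(data)

# one-column matrix of index values; row names become the leaf labels
data2 <- as.matrix(data[,2])
rownames(data2) <- as.character(data$ind)

d <- dist(data2)

# prepare hierarchical cluster
hc <- hclust(d)
# very simple dendrogram
plot(hc)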

This will produce a simple dendrogram. However, I actually want to use the values from the index column as "my distance". Any suggestions are welcome. Thanks in advance!
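
A note on the metric: for a single numeric column, `dist()` with its default Euclidean method reduces to the absolute difference between index values, so the distances are already in the units of the index. Below is a minimal check on a few values copied from the data above; the object names `idx`, `d_small` and `by_hand` are illustrative only.

```r
# A few index values copied from the data above; names become the labels
idx <- c(T1 = 0.10, T2 = 0.11, T4 = 0.64, T14 = 0.69)

# Default method is "euclidean"; in one dimension this is |x_i - x_j|
d_small <- dist(idx)

# The same distances computed by hand from the index values
by_hand <- abs(outer(idx, idx, "-"))

print(round(as.matrix(d_small), 2))

# Compare values only, ignoring dimnames and other attributes
print(isTRUE(all.equal(as.matrix(d_small), by_hand, check.attributes = FALSE)))
```

If the two matrices agree, the default distance already represents differences in the original index, and no transformation is needed before calling `hclust()`.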

lawyeR · Accepted Answer

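Perhaps this will help? The differences in your index values are on the y-axis: with single linkage, each merge height is the smallest distance between a member of one cluster and a member of the other.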

hc <- hclust(d = d, method="single", members=NULL)
library(ggdendro)
ggdendrogram(hc, theme_dendro=FALSE)
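
To read clusters off the tree rather than only plotting it, one option is to compare the single-linkage merge heights with the gaps between neighbouring sorted index values, and then cut the tree with `cutree()`. This is a minimal sketch on a subset of the question's values; the names `idx`, `hc_small` and `groups`, and the cut height of 0.1, are assumptions chosen for illustration and do not come from the question.

```r
# Subset of the index values from the question; names become leaf labels
idx <- c(T1 = 0.10, T2 = 0.11, T4 = 0.64, T14 = 0.69, T16 = 0.51)

hc_small <- hclust(dist(idx), method = "single")

# With single linkage on one variable, the merge heights are the gaps
# between neighbouring values once the index is sorted
print(hc_small$height)
print(sort(diff(sort(unname(idx)))))

# Cut the tree at an illustrative height: items whose chain of index
# differences stays at or below 0.1 end up in the same group
groups <- cutree(hc_small, h = 0.1)
print(groups)
```

A different cut height, or `cutree(hc_small, k = ...)` with a fixed number of groups, would give a coarser or finer grouping of the same tree.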
