Jeff

Reputation: 8421

How to get labels from hclust result

Let's say I have a dataset like this:

dt<-data.frame(id=1:4,X=sample(4),Y=sample(4))

Then I run a hierarchical clustering with the code below:

dis<-dist(dt[,-1])
clusters <- hclust(dis)
plot(clusters)

It works well.


The problem is that when I ask for

clusters$labels

it gives me NULL, whereas I expect to see the labels of the individuals in order, like

1, 4, 2, 3

It is important to get them in the order in which they appear in the plot.

Upvotes: 4

Views: 8740

Answers (3)

PKumar

Reputation: 11128

Use clusters$order rather than clusters$labels if you have not assigned any labels.

In fact, you can see all the components of the result with the summary function:

clusters <- hclust(dis)
plot(clusters)
summary(clusters)
clusters$order

You can compare with the plot I got on my end; it is of course a little different from yours, since the data come from sample().

My outcome:

> clusters$order
[1] 4 1 2 3

Content of summary command:

> summary(clusters)
            Length Class  Mode     
merge       6      -none- numeric  
height      3      -none- numeric  
order       4      -none- numeric  
labels      0      -none- NULL     
method      1      -none- character
call        2      -none- call     
dist.method 1      -none- character

You can see that labels is NULL, which is why you are not getting any labels. To get labels, you need to assign them first with clusters$labels <- c("A","B","C","D"), or assign them from the row names. Once the labels are assigned, you will see the names/labels instead of the numbers.

In my case I have not assigned any names, so I get the numbers instead.

You can put the labels in the plot function itself as well.

From the documentation ?hclust

labels
A character vector of labels for the leaves of the tree. By default the row names or row numbers of the original data are used. If labels = FALSE no labels at all are plotted.
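As a minimal sketch of passing labels at plot time: the data below are fixed values chosen for repeatability (not the question's random sample), and leaf_names is a hypothetical label vector. Supplying labels to plot() does not modify the hclust object, so the plotted order is recovered by indexing the same vector with clusters$order.

```r
# Fixed data (an assumption for repeatability; the question uses sample())
dt <- data.frame(id = 1:4, X = c(1, 2, 10, 11), Y = c(1, 2, 10, 12))
clusters <- hclust(dist(dt[, -1]))

# Hypothetical labels, supplied only at plot time
leaf_names <- c("A", "B", "C", "D")
plot(clusters, labels = leaf_names)

# plot() does not store the labels: clusters$labels is still NULL here.
# Index the label vector by clusters$order to get the left-to-right leaf order.
plotted <- leaf_names[clusters$order]
print(plotted)
```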


Upvotes: 2

KoenV

Reputation: 4283

You could use the following code:

# your data; I changed the id to characters to make it clearer
set.seed(1234) # for reproducibility
dt<-data.frame(id=c("A", "B", "C", "D"),X=sample(4),Y=sample(4))
dt

# your code, no labels    
dis<-dist(dt[,-1])
clusters <- hclust(dis)
clusters$labels

# add labels, plot and check labels
clusters$labels <- dt$id
plot(clusters)

## labels in the order plotted
clusters$labels[clusters$order]
## [1] A D B C
## Levels: A B C D

Please let me know whether this is what you want.

Upvotes: 2

Jon

Reputation: 2567

Make sure you set rownames(...) before clustering, so that the result carries labels:

> rownames(dt) <- dt$id
> dt
id X Y
1  1 2 1
2  2 4 3
3  3 1 2
4  4 3 4
> dis<-dist(dt[,-1])
> clusters <- hclust(dis)
> str(clusters)
List of 7
$ merge      : int [1:3, 1:2] -1 -2 1 -3 -4 2
$ height     : num [1:3] 1.41 1.41 3.16
$ order      : int [1:4] 1 3 2 4
$ labels     : chr [1:4] "1" "2" "3" "4"
$ method     : chr "complete"
$ call       : language hclust(d = dis)
$ dist.method: chr "euclidean"
- attr(*, "class")= chr "hclust"
>
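To get from these stored labels to the order shown in the plot, one possible sketch follows. It uses fixed, made-up data with character ids (an assumption, so that the labels are easy to tell apart from row numbers) and indexes clusters$labels by clusters$order.

```r
# Made-up data with character ids (an assumption; not the question's data)
dt <- data.frame(id = c("A", "B", "C", "D"),
                 X = c(1, 2, 10, 11),
                 Y = c(1, 2, 10, 12))

# Explicit row names are carried through dist() into the hclust result
rownames(dt) <- dt$id
clusters <- hclust(dist(dt[, -1]))

# Labels in data order, as stored on the object
print(clusters$labels)

# Labels in the left-to-right order used by plot(clusters)
ordered_labels <- clusters$labels[clusters$order]
print(ordered_labels)
```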

Upvotes: 0
