Amirul Islam

Reputation: 457

How to add cluster id in a separate column of a dataframe?

I have produced a dendrogram with hclust and cut it into two clusters. I know from the graph which row belongs to which cluster. What I want to do is create a separate column in the dataframe that contains "class-1" if the row belongs to the first cluster and "class-2" if it belongs to the second cluster.

Upvotes: 0

Views: 1899

Answers (1)

Rui Barradas

Reputation: 76432

Without an example dataset, I will use the built-in USArrests.
If you create a factor column with the labels "class-1" and "class-2", R assigns them, in that order, to the cluster ids 1 and 2 returned by cutree.

hc <- hclust(dist(USArrests), "average")  # average linkage, taken from the help page ?hclust
memb <- cutree(hc, k = 2)                 # one cluster id (1 or 2) per row

res <- cbind(USArrests, Class = factor(unname(memb), labels = c("class-1", "class-2")))
head(res)
#           Murder Assault UrbanPop Rape   Class
#Alabama      13.2     236       58 21.2 class-1
#Alaska       10.0     263       48 44.5 class-1
#Arizona       8.1     294       80 31.0 class-1
#Arkansas      8.8     190       50 19.5 class-2
#California    9.0     276       91 40.6 class-1
#Colorado      7.9     204       78 38.7 class-2
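
As a variation on the above, here is a minimal sketch that adds the column in place with `$<-` instead of `cbind`. The name `df` is just a placeholder for your own data frame; everything else is base R. It relies on cutree returning the ids in the same row order as the data passed to dist, which the sanity check confirms for this dataset.

```r
# Sketch only: `df` stands in for your own data frame.
df <- USArrests

hc   <- hclust(dist(df), method = "average")
memb <- cutree(hc, k = 2)   # named integer vector, one id per row

# Sanity check: ids come back in the original row order
stopifnot(identical(names(memb), rownames(df)))

# Map id 1 -> "class-1" and id 2 -> "class-2" explicitly
df$Class <- factor(unname(memb), levels = c(1, 2),
                   labels = c("class-1", "class-2"))

table(df$Class)   # number of rows in each cluster
```

Giving `levels` explicitly keeps the id-to-label mapping fixed even if one of the clusters happens to be absent from a subset of the data.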

Upvotes: 3
