Reputation: 21
I've only been using R for a short time. I use R 3.4.4. For a study I created clusters as follows:
library('cluster')
clusterward2 <- agnes(dist.om2, diss = TRUE, method = "ward")
plot(clusterward2)
plot(sort(clusterward2$height, decreasing=TRUE)[1:15], type='s', xlab="nb de classes", ylab="inertie")
points(c(2, 3,5), sort(clusterward2$height, decreasing=TRUE)[c(2, 3,5)],
col = c("green3", "red3", "blue3"), cex = 2, lwd = 4)
cl2.3 <- cutree(clusterward2, k = 3)
Then I retrieve the individuals belonging to each cluster as follows:
split(mydata$colonneID, cl2.3)
Is it possible to retrieve the score that each individual had in the clustering? I would like to analyse the extreme members of each class, but I don't know whether it is possible or how to do it.
My data:
donnees <- "CT H_NH I1 I2 I3 I4
CT_5 humain SN_def SN_dem SN_dem Pro
CT_6 humain SN_def SN_ind SN_def SN_dem
CT_7 humain SN_def SN_dem SN_pos SN_dem
CT_8 humain SN_def Autre SN_def SN_def
CT_9 humain Autre SN_def SN_def SN_def
CT_15 humain SN_ind SN_def SN_def SN_def
CT_17 humain Autre SN_pos SN_dem Autre
CT_18 humain SN_def Pro SN_def Pro
CT_19 humain SN_def Pro SN_def SN_pos
CT_20 humain SN_def SN_def Pro SN_pos
CT_27 humain NPP Pro Pro Pro
CT_29 humain NPP SN_sansDET NPP SN_pos
CT_30 humain SN_sansDET Pro SN_def Pro
CT_32 humain SN_def SN_def SN_def SN_dem
CT_33 humain Autre NPP NPP SN_def
CT_34 humain NPP Pro NPP Pro
CT_35 humain SN_def NPP Pro NPP"
Data <- read.table(text=donnees, header = TRUE)
The code I used:
library(TraMineR)
Data.lab <- seqstatl(Data[,3:6])
Data.scode <- c("Autre", "NPP", "Pro", "SN_def", "SN_dem", "SN_ind", "SN_pos", "SN_sansDET")
Data.seq_7 <- seqdef(Data[, 3:6], states = Data.scode)
submat2 <- seqsubm(Data.seq_7, method = "TRATE")
dist.om2 <- seqdist(Data.seq_7, method = "OM", indel = 1, sm = submat2)
library('cluster')
clusterward2 <- agnes(dist.om2, diss = TRUE, method = "ward")
plot(clusterward2, which.plots=2)
plot(sort(clusterward2$height, decreasing=TRUE)[1:15], type='s', xlab="nb de classes", ylab="inertie")
points(c(2, 3,5), sort(clusterward2$height, decreasing=TRUE)[c(2, 3,5)],
col = c("green3", "red3", "blue3"), cex = 2, lwd = 4)
cl2.3 <- cutree(clusterward2, k = 3)
Thank you very much for your help.
Upvotes: 2
Views: 137
Reputation: 3669
You can get each individual's distance to the center of its cluster with the disscenter function of TraMineR:
diss.to.cl.center <- disscenter(dist.om2, group=cl2.3)
res <- data.frame(seqconc(Data.seq_7), cl2.3, diss.to.cl.center)
res
# Sequence cl2.3 diss.to.cl.center
# 1 SN_def-SN_dem-SN_dem-Pro 1 0.9947090
# 2 SN_def-SN_ind-SN_def-SN_dem 2 1.3511905
# 3 SN_def-SN_dem-SN_pos-SN_dem 1 0.7447090
# 4 SN_def-Autre-SN_def-SN_def 2 1.3167989
# 5 Autre-SN_def-SN_def-SN_def 2 1.3167989
# 6 SN_ind-SN_def-SN_def-SN_def 2 1.3167989
# 7 Autre-SN_pos-SN_dem-Autre 1 1.4788360
# 8 SN_def-Pro-SN_def-Pro 2 1.2096561
# 9 SN_def-Pro-SN_def-SN_pos 2 1.1957672
# 10 SN_def-SN_def-Pro-SN_pos 2 1.1554233
# 11 NPP-Pro-Pro-Pro 3 1.4152381
# 12 NPP-SN_sansDET-NPP-SN_pos 3 1.7533333
# 13 SN_sansDET-Pro-SN_def-Pro 2 2.3842593
# 14 SN_def-SN_def-SN_def-SN_dem 2 0.7890212
# 15 Autre-NPP-NPP-SN_def 3 1.7485714
# 16 NPP-Pro-NPP-Pro 3 0.6652381
# 17 SN_def-NPP-Pro-NPP 3 1.1033333
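To look at the extremes of each class, one option is to rank the individuals by this distance within their cluster. The following is only a minimal base R sketch: it rebuilds the cluster and distance columns by hand from the output above so that it runs on its own, and the column name id and the object names res.sorted, extremes and central are my own, not part of TraMineR. In your session you would use cl2.3 and diss.to.cl.center directly instead of the hard-coded vectors.

```r
# Cluster membership and distance to cluster center, copied from the output above.
# 'id' is an illustrative column holding the CT identifiers from the question's data.
res <- data.frame(
  id = c("CT_5", "CT_6", "CT_7", "CT_8", "CT_9", "CT_15", "CT_17", "CT_18", "CT_19",
         "CT_20", "CT_27", "CT_29", "CT_30", "CT_32", "CT_33", "CT_34", "CT_35"),
  cl2.3 = c(1, 2, 1, 2, 2, 2, 1, 2, 2, 2, 3, 3, 2, 2, 3, 3, 3),
  diss.to.cl.center = c(0.9947090, 1.3511905, 0.7447090, 1.3167989, 1.3167989,
                        1.3167989, 1.4788360, 1.2096561, 1.1957672, 1.1554233,
                        1.4152381, 1.7533333, 2.3842593, 0.7890212, 1.7485714,
                        0.6652381, 1.1033333),
  stringsAsFactors = FALSE
)

# Sort by cluster, then by decreasing distance (most atypical member first)
res.sorted <- res[order(res$cl2.3, -res$diss.to.cl.center), ]

# Helper: apply a row selector within each cluster and stack the results
pick.by.cluster <- function(d, selector) {
  do.call(rbind, lapply(split(d, d$cl2.3), function(g) {
    g[selector(g$diss.to.cl.center), ]
  }))
}

# Farthest member of each cluster (the "extreme" individuals)
extremes <- pick.by.cluster(res, which.max)

# Closest member of each cluster (the most typical individuals)
central <- pick.by.cluster(res, which.min)

print(extremes)
print(central)
```

A large distance means the sequence is atypical for its cluster, and a small one means it is close to the cluster center. which.max and which.min return only the first row in case of ties, so use res.sorted if you want to see all the candidates.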
Upvotes: 1