I am trying to extract the Validation Measures from an R cluster validation object created using clValid.
I create the object and print the full summary with the following:
library(clValid)
x <- clValid(iris[, -5], nClust = 2:10,
             clMethods = "hierarchical", validation = "internal")
summary(x)
The output of this is:
Clustering Methods:
 hierarchical

Cluster sizes:
 2 3 4 5 6 7 8 9 10

Validation Measures:
                                2       3       4       5       6       7       8       9      10
hierarchical Connectivity  0.0000  4.4770  8.9929 15.4893 18.4183 24.8464 29.8425 36.8567 39.5607
             Dunn          0.3389  0.1378  0.1540  0.1540  0.1668  0.1624  0.1624  0.1915  0.1915
             Silhouette    0.6867  0.5542  0.4720  0.4307  0.3420  0.3707  0.3659  0.3167  0.3083

Optimal Scores:
             Score  Method       Clusters
Connectivity 0.0000 hierarchical 2
Dunn         0.3389 hierarchical 2
Silhouette   0.6867 hierarchical 2
Required output
I am trying to get the Validation Measures as a data.frame like this:
                                2       3       4       5       6       7       8       9      10
hierarchical Connectivity  0.0000  4.4770  8.9929 15.4893 18.4183 24.8464 29.8425 36.8567 39.5607
             Dunn          0.3389  0.1378  0.1540  0.1540  0.1668  0.1624  0.1624  0.1915  0.1915
             Silhouette    0.6867  0.5542  0.4720  0.4307  0.3420  0.3707  0.3659  0.3167  0.3083
Attempt
When I use:
names(summary(x))
attributes(summary(x))
these both give
NULL
I can get the Optimal Scores using optimalScores(x); however, the analogous call validationMeasures(x) does not work.
Question
Is there a way to extract the Validation Measures as a data.frame from this summary object?
First of all, you should always start by inspecting the structure of the object with str():
str(x)
Formal class 'clValid' [package "clValid"] with 14 slots
..@ clusterObjs:List of 1
.. ..$ hierarchical:List of 7
.. .. ..$ merge : int [1:149, 1:2] -102 -8 -1 -10 -129 -11 -5 -20 -30 -58 ...
.. .. ..$ height : num [1:149] 0 0.1 0.1 0.1 0.1 ...
.. .. ..$ order : int [1:150] 42 15 16 33 34 37 21 32 44 24 ...
.. .. ..$ labels : NULL
.. .. ..$ method : chr "average"
.. .. ..$ call : language hclust(d = Dist, method = method)
.. .. ..$ dist.method: chr "euclidean"
.. .. ..- attr(*, "class")= chr "hclust"
..@ measures : num [1:3, 1:9, 1] 0 0.339 0.687 4.477 0.138 ...
.. ..- attr(*, "dimnames")=List of 3
.. .. ..$ : chr [1:3] "Connectivity" "Dunn" "Silhouette"
.. .. ..$ : chr [1:9] "2" "3" "4" "5" ...
.. .. ..$ : chr "hierarchical"
..@ measNames : chr [1:3] "Connectivity" "Dunn" "Silhouette"
..@ clMethods : chr "hierarchical"
..@ labels : chr [1:150] "1" "2" "3" "4" ...
..@ nClust : num [1:9] 2 3 4 5 6 7 8 9 10
..@ validation : chr "internal"
..@ metric : chr "euclidean"
..@ method : chr "average"
..@ neighbSize : num 10
..@ annotation : NULL
..@ GOcategory : chr "all"
..@ goTermFreq : num 0.05
..@ call : language clValid(obj = iris[, -5], nClust = 2:10, clMethods = c("hierarchical"), validation = "internal")
So we can see that this package uses and returns S4 objects, and that one of the slots, measures, seems to be the one you want. S4 slots are accessed with @:
x@measures[,,"hierarchical"]
                     2         3         4          5          6          7
Connectivity 0.0000000 4.4769841 8.9928571 15.4892857 18.4182540 24.8464286
Dunn         0.3389087 0.1378257 0.1540416  0.1540416  0.1668323  0.1624158
Silhouette   0.6867351 0.5541609 0.4719936  0.4306700  0.3419904  0.3707424
                      8          9         10
Connectivity 29.8424603 36.8567460 39.5607143
Dunn          0.1624158  0.1914854  0.1914854
Silhouette    0.3658753  0.3166807  0.3082851
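The slice above is a numeric matrix, not yet the data.frame the question asks for. As a minimal sketch of the remaining step, the code below builds a small stand-in S4 object (the class name DemoValid is hypothetical and exists only for this illustration) whose measures slot has the same measure x clusters x method layout shown by str(x), filled with the first three columns of the output above. With the real object, the same two lines at the end should apply to x@measures. The clValid class documentation also appears to list a measures() accessor, which would be worth checking in ?clValid before relying on it.

```r
library(methods)  # setClass() and new(); attached by default in interactive R

# Hypothetical stand-in for the clValid result: a single 'measures' slot
# shaped measure x number of clusters x method, as in the str(x) output.
setClass("DemoValid", representation(measures = "array"))

# Values copied from the answer's output, filled column by column.
vals <- c(0.0000000, 0.3389087, 0.6867351,   # 2 clusters
          4.4769841, 0.1378257, 0.5541609,   # 3 clusters
          8.9928571, 0.1540416, 0.4719936)   # 4 clusters

demo <- new("DemoValid",
            measures = array(vals, dim = c(3, 3, 1),
                             dimnames = list(c("Connectivity", "Dunn", "Silhouette"),
                                             c("2", "3", "4"),
                                             "hierarchical")))

slotNames(demo)                          # lists the slots available on the object

m  <- demo@measures[, , "hierarchical"]  # drop the method dimension: 3 x 3 matrix
df <- as.data.frame(m)                   # one column per cluster count
df
```

The column names stay as "2", "3", "4", so a single value can be looked up with character indices, for example df["Dunn", "3"]. If several clustering methods were requested, the same extraction could be repeated per method name along the third dimension.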