Reputation: 360
ggg <- data.frame(row.names=c("a","b","c","d","e"),var1=c("0","0","0","0","0"),var2=c("1","1","1","1","2"))
ggg_dist <- as.matrix(ggg) %>% as.dist(.)
Warning message:
In as.dist.default(.) : non-square matrix
class(ggg_dist)
[1] "dist"
ggg_dist
Warning message:
In df[row(df) > col(df)] <- x :
number of items to replace is not a multiple of replacement length
h_ggg <- hclust(ggg_dist,method="average")
Error in hclust(ggg_dist, method = "average") :
'D' must have length (N \choose 2).
I want to perform hierarchical clustering on ggg. ggg_dist is a dist object (as confirmed with class()) built from ggg. I want to run hclust() on ggg_dist, but it fails with the error shown above. How can I solve this?
I tried the approach from "How to convert data.frame into distance matrix for hierarchical clustering?", but I get the same error when I try to call ggg_dist.
Upvotes: 0
Views: 1207
Reputation: 1595
as.dist() requires a square matrix or data.frame. Your original object ggg has 5 rows but just 2 columns.
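To make the mismatch concrete, here is a minimal sketch that counts the values involved. It assumes a numeric copy of the question's data (the matrix m below is hypothetical, not the original character data.frame) and uses only base R.

```r
# Hypothetical numeric version of the question's data (5 rows, 2 columns).
m <- matrix(c(0, 0, 0, 0, 0,
              1, 1, 1, 1, 2),
            ncol = 2,
            dimnames = list(c("a", "b", "c", "d", "e"), c("var1", "var2")))

n <- nrow(m)
expected_len <- choose(n, 2)  # number of pairwise distances hclust() needs for n objects

# as.dist() keeps only the entries below the diagonal of what it assumes is a
# square matrix; for a 5 x 2 input that yields fewer values than choose(5, 2).
bad <- suppressWarnings(as.dist(m))

expected_len       # values a valid dist object for 5 objects would hold
length(bad)        # values actually stored
attr(bad, "Size")  # the object still claims to describe n objects
```

The stored length and the "Size" attribute disagree, which is what hclust() complains about with "'D' must have length (N \choose 2)".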
Something like the following would work.
ggg <- data.frame(row.names = c("a", "b"),
                  var1 = c("0", "0"),
                  var2 = c("1", "1"))
ggg_dist <- as.dist(ggg)
h_ggg <- hclust(ggg_dist, method="average")
h_ggg
#>
#> Call:
#> hclust(d = ggg_dist, method = "average")
#>
#> Cluster method : average
#> Number of objects: 2
Created on 2020-05-27 by the reprex package (v0.3.0)
Upvotes: 0
Reputation: 21400
You can use the function dist(), which computes the pairwise distances between the rows of ggg:
ggg_dist <- dist(ggg, method = "euclidean")
Result:
ggg_dist
  a b c d
b 0
c 0 0
d 0 0 0
e 1 1 1 1
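For completeness, a sketch of the whole workflow from data to clusters might look like the following. It assumes the columns are meant as numbers, so they are stored as numeric here rather than as the character strings in the question; the cutree() step and k = 2 are illustrative additions, not part of the original question.

```r
# Assumption: var1 and var2 are numeric measurements, not labels.
ggg <- data.frame(row.names = c("a", "b", "c", "d", "e"),
                  var1 = c(0, 0, 0, 0, 0),
                  var2 = c(1, 1, 1, 1, 2))

# Pairwise Euclidean distances between the rows of ggg.
ggg_dist <- dist(ggg, method = "euclidean")

# Average-linkage hierarchical clustering on the distance object.
h_ggg <- hclust(ggg_dist, method = "average")

# Cut the tree into two groups (k = 2 is an arbitrary choice for illustration).
groups <- cutree(h_ggg, k = 2)
groups
```

Because dist() builds the distances from the rows, the resulting object has the length hclust() expects, and plot(h_ggg) would draw the dendrogram.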
Upvotes: 1