Reputation: 614
I am trying to do some grouping and keep running into this error:
Evaluation error: the 'height' component of 'tree' is not sorted (increasingly).
My input is:
library(stringdist)
name <- c("luke,abcdef","luke,abcdeh","luke,abcdeg")
a<-stringdistmatrix(name, method="jw")
clusts <- hclust(a, method="ward.D2")
But when I try to cut it, it gives me an error:
> cutree(clusts, h = 0.155)
Error in cutree(clusts, h = 0.155) :
the 'height' component of 'tree' is not sorted (increasingly)
But if I use
a<-stringdistmatrix(name, method="jw", p=0.05)
everything works fine.
I have looked for a solution and couldn't find one. What should I do to prevent this from happening and keep it working?
I have also noticed that if I have the same distance matrix, but generated by hand (so there is no distance parameter in the cluster.
Upvotes: 2
Views: 2329
Reputation: 18425
If you compare diff(clusts$height) for these two examples, the first comes out as a tiny negative number and the second as exactly zero. So the problem is caused by binary-representation rounding differences in values that should be equal.
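The same effect can be reproduced in base R without any clustering. This is only an illustrative sketch: the vector h below is a made-up stand-in for clusts$height, built from the classic 0.1 + 0.2 versus 0.3 comparison. The two values are equal on paper but differ in the last bits, so diff() is a tiny negative number and is.unsorted() reports TRUE.

```r
# Illustrative stand-in for clusts$height (not the real heights):
# two values that are mathematically equal but differ in binary representation.
h <- c(0.1 + 0.2, 0.3)

# The second value is very slightly smaller than the first.
print(diff(h))

# The vector therefore counts as "not sorted increasingly".
print(is.unsorted(h))

# all.equal() compares with a tolerance and treats the values as equal.
print(isTRUE(all.equal(h[1], h[2])))

# After rounding, both entries are identical and the vector is sorted again.
print(is.unsorted(round(h, 6)))
```

Running diff() and is.unsorted() on the real clusts$height should show whether your tree has the same issue.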
It should work if you round the heights after calculating clusts:
...
clusts$height <- round(clusts$height, 6)
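If you want to avoid repeating this by hand, the rounding can be wrapped in a small helper. The sketch below is an assumption-laden illustration: cut_rounded is a hypothetical name (not part of base R or stringdist), the default of 6 digits is taken from the line above, and the demo tree uses simulated heights rather than the stringdist data so that it runs with base R alone.

```r
# Hypothetical helper: round the heights only when they are not sorted,
# then cut the tree at height h.
cut_rounded <- function(tree, h, digits = 6) {
  if (is.unsorted(tree$height)) {
    tree$height <- round(tree$height, digits)
  }
  cutree(tree, h = h)
}

# Demo tree built with base R; the points are arbitrary.
tree <- hclust(dist(c(0, 1, 5)), method = "ward.D2")

# Simulate the problem: two merge heights that should be equal,
# with the second one a hair smaller because of rounding.
tree$height <- c(0.1, 0.1 - 1e-15)

# Plain cutree() refuses to cut this tree.
plain <- tryCatch(cutree(tree, h = 0.155), error = function(e) "error")
print(plain)

# The helper rounds first, so the cut succeeds.
groups <- cut_rounded(tree, h = 0.155)
print(groups)
```

With the objects from the question, the call would be cut_rounded(clusts, h = 0.155). Keep in mind that rounding to 6 digits also merges heights that genuinely differ by less than that, so pick digits to suit the scale of your distances.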
Upvotes: 5