Ravonrip

Reputation: 614

The 'height' component of 'tree' is not sorted Error in cutree

I am trying to do some grouping and am encountering this error.

Evaluation error: the 'height' component of 'tree' is not sorted (increasingly).

My input is:

library(stringdist)
name <- c("luke,abcdef","luke,abcdeh","luke,abcdeg")
a<-stringdistmatrix(name, method="jw")
clusts <- hclust(a, method="ward.D2")

But when I try to cut it, it gives me an error:

> cutree(clusts, h = 0.155)
Error in cutree(clusts, h = 0.155) : 
  the 'height' component of 'tree' is not sorted (increasingly)

But if I use

a<-stringdistmatrix(name, method="jw", p=0.05)

everything works fine.

I have looked for a solution and couldn't find one. What should I do to prevent this error and keep cutree working?

I have also noticed, that if I have the same distance matrix, but generated by hand (so there is no distance parameter in the cluster.

Upvotes: 2

Views: 2329

Answers (1)

Andrew Gustar

Reputation: 18425

If you compare diff(clusts$height) for the two examples, the first gives a tiny negative number and the second gives exactly zero. The two merge heights should be equal, but floating-point rounding leaves the second one marginally smaller than the first, so cutree sees the heights as not sorted.
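To see the effect in isolation, here is a minimal sketch that uses a hand-made height vector instead of the output of hclust. The values 0.1 + 0.2 and 0.3 are an assumption, chosen only because they are a familiar pair that should be equal but are not in floating point; they are not the heights from the question.

```r
# Hypothetical heights: mathematically equal, but not equal as doubles.
heights <- c(0.1 + 0.2, 0.3)

step <- diff(heights)             # a tiny negative number rather than 0
unsorted <- is.unsorted(heights)  # the vector decreases, if only just

print(step)
print(unsorted)

# Within the default tolerance of all.equal() the two values are the same.
print(isTRUE(all.equal(heights[1], heights[2])))
```

Running the same diff() and is.unsorted() checks on clusts$height should show whether a tree is affected before cutree is called.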

It should work if you round the heights after calculating clusts and before calling cutree:

clusts$height <- round(clusts$height, 6)
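If you would rather not pick a number of digits, a different technique from the rounding above is to replace the heights with their running maximum, which cannot decrease. The sketch below is only an illustration: fix_heights and its tol argument are hypothetical names, not part of base R or stringdist, and the tree is a stand-in with the rounding artefact injected by hand so that the example runs without stringdist.

```r
# Hypothetical helper: flatten decreases in the merge heights, but only
# when they are small enough to be rounding noise.
fix_heights <- function(tree, tol = 1e-8) {
  drops <- diff(tree$height)
  if (any(drops < -tol)) {
    stop("heights decrease by more than 'tol'; this is not rounding noise")
  }
  tree$height <- cummax(tree$height)  # a running maximum never decreases
  tree
}

# Stand-in tree: three equidistant points (assumed data, not the
# Jaro-Winkler distances from the question).
d <- as.dist(matrix(c(0, 1, 1,
                      1, 0, 1,
                      1, 1, 0), nrow = 3))
clusts <- hclust(d, method = "ward.D2")

# Inject the artefact by hand: two heights that should be equal.
clusts$height <- c(0.1 + 0.2, 0.3)

fixed <- fix_heights(clusts)
groups <- cutree(fixed, h = 0.155)
print(groups)
```

The tolerance guard is there because some linkage methods can produce genuine decreases in height, and those should not be silently flattened.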

Upvotes: 5
