I have the following custom made NxN distance matrix in numpy/scipy:
dist_matrix = array([array([5, 4, 2, 3, 2, 3]),
array([4, 5, 2, 3, 2, 2]),
array([2, 2, 5, 2, 2, 1]),
array([3, 3, 2, 5, 4, 2]),
array([2, 2, 2, 4, 5, 1]),
array([3, 2, 1, 2, 1, 5])])
How can I use this matrix to do hierarchical clustering and plot dendrograms in R / ggplot2? If I try to pass this distance matrix to R via rpy2 as:
r.hclust(dist_matrix)
I get the error:
res = super(Function, self).__call__(*new_args, **new_kwargs)
rpy2.rinterface.RRuntimeError: Error in if (is.na(n) || n > 65536L) stop("size cannot be NA nor exceed 65536") :
missing value where TRUE/FALSE needed
Upvotes: 1
Views: 740
Reputation: 11543
The R function hclust() expects a "dist" object rather than a plain matrix, so convert the matrix with as.dist() first:
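from rpy2.robjects import r, FloatVector
from rpy2.robjects.packages import importr
stats = importr("stats")
# m: the distance matrix as an R matrix, filled row by row from the NumPy data
m = r.matrix(FloatVector([float(x) for row in dist_matrix for x in row]), nrow=len(dist_matrix), byrow=True)
d = stats.as_dist(m)  # as.dist() in R; importr exposes it as as_dist
hc = stats.hclust(d)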
[Note: the error message also hints at a possible conversion bug in rpy2. Could you file a bug report? Thanks.]
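To see what the as.dist() conversion keeps, here is a minimal sketch in plain Python (no R or NumPy needed). It assumes the documented default behaviour of R's as.dist(), which uses only the lower triangle of the matrix and ignores the rest, including the diagonal. The helper names is_symmetric and lower_triangle are made up for this illustration and are not part of rpy2 or R.

```python
# The matrix from the question, written as plain nested lists.
dist_matrix = [
    [5, 4, 2, 3, 2, 3],
    [4, 5, 2, 3, 2, 2],
    [2, 2, 5, 2, 2, 1],
    [3, 3, 2, 5, 4, 2],
    [2, 2, 2, 4, 5, 1],
    [3, 2, 1, 2, 1, 5],
]


def is_symmetric(m):
    """Hypothetical helper: True if m is square and m[i][j] == m[j][i]."""
    n = len(m)
    if any(len(row) != n for row in m):
        return False
    return all(m[i][j] == m[j][i] for i in range(n) for j in range(i))


def lower_triangle(m):
    """Hypothetical helper: entries below the diagonal, column by column.

    This mirrors the order in which an R "dist" object stores its values
    (assumption: default as.dist() behaviour on a plain matrix).
    """
    n = len(m)
    return [m[i][j] for j in range(n) for i in range(j + 1, n)]


condensed = lower_triangle(dist_matrix)
diagonal = [dist_matrix[i][i] for i in range(len(dist_matrix))]

print(is_symmetric(dist_matrix))  # the matrix should be symmetric before clustering
print(len(condensed))             # n * (n - 1) / 2 pairwise values
print(diagonal)                   # these values are not used by as.dist()
```

Note that the diagonal of this matrix is 5 rather than 0. Those entries are dropped by the conversion, so only the 15 off-diagonal pairs reach hclust().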
Upvotes: 1