Reputation: 191
I developed a new document similarity measure (a method that calculates the similarity or distance between two documents). How can I find out how well this measure performs?
Clustering is an application that relies on a distance/similarity measure, so I decided to evaluate the effectiveness of the proposed measure with different clustering algorithms.
I have read about different clustering algorithms in R. Suppose I have a document collection D that contains n documents, organized in k clusters. I want to evaluate how my similarity/distance measure performs with a variety of clustering algorithms (partitional, hierarchical and topic-based). The problem is that all the examples and tutorials start from a "data" matrix, whereas I have a distance/similarity matrix.
Could you give me some hints on how to do this in R?
Upvotes: 0
Views: 2657
Reputation: 7592
hclust() requires a dissimilarity structure in the form of a dist object. If you start with a numeric matrix, m, you can create a dist object like so:
d <- as.dist(m)
You can then perform hierarchical clustering using hclust() like so:
hclust(d)
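As a rough sketch of how this could extend to the similarity case and to a partitional method: the example below uses a made-up 6-document similarity matrix (the values, the names s, k, hc_labels and pam_labels, and the 1 - s conversion are all assumptions for illustration, not part of the question). It assumes the similarity is bounded by 1, so that 1 - s can serve as a dissimilarity. pam() from the cluster package is used as the partitional example because it accepts a dist object, whereas kmeans() expects a data matrix.

```r
library(cluster)  # provides pam(); a recommended package shipped with R

# Toy symmetric similarity matrix for 6 documents (hypothetical values in [0, 1])
s <- matrix(0.1, nrow = 6, ncol = 6)
s[1:3, 1:3] <- 0.9   # documents 1-3 are similar to each other
s[4:6, 4:6] <- 0.8   # documents 4-6 are similar to each other
diag(s) <- 1
rownames(s) <- colnames(s) <- paste0("doc", 1:6)

# Assumption: similarity is bounded by 1, so 1 - s is a usable dissimilarity
d <- as.dist(1 - s)

# Hierarchical clustering, then cut the tree into k groups
k <- 2
hc <- hclust(d, method = "average")
hc_labels <- cutree(hc, k = k)

# Partitional clustering (k-medoids) directly on the dist object
pm <- pam(d, k = k, diss = TRUE)
pam_labels <- pm$clustering

# Cross-tabulate the two partitions to see how far they agree
print(table(hc_labels, pam_labels))
```

If the true cluster of each document is known, the same table() call can be used to compare either set of labels against the reference assignment.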
Upvotes: 1