Amir H. Jadidinejad

Reputation: 191

R data clustering using a pre-defined distance/similarity matrix

I developed a new document similarity measure (a method that calculates the similarity/distance between two documents). I would like to know how well this measure performs.

Clustering depends directly on a distance/similarity measure, so I decided to evaluate the effectiveness of the proposed measure in different data clustering algorithms.

I read about different clustering algorithms in R. Suppose I have a document collection D which contains n documents, organized in k clusters. I want to evaluate the application of my similarity/distance measure in a variety of clustering algorithms (partitional, hierarchical and topic-based). The problem is that all examples and tutorials start from a "data" matrix, but I have a distance/similarity matrix.

Would you please help me with some hints in R?

Upvotes: 0

Views: 2657

Answers (1)

Christopher Louden

Reputation: 7592

hclust() requires its dissimilarities as a dist object. If you start with a square numeric matrix of distances, m, you can create a dist object like so:

d <- as.dist(m)
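If the matrix holds similarities rather than distances, it has to be converted before calling as.dist(). A minimal sketch, assuming similarities scaled to [0, 1] with 1 on the diagonal; the toy matrix `s`, the document names, and the `1 - s` transform are illustrative assumptions, not part of the original answer:

```r
# Hypothetical 4 x 4 similarity matrix (symmetric, values in [0, 1]).
s <- matrix(c(1.0, 0.9, 0.2, 0.1,
              0.9, 1.0, 0.3, 0.2,
              0.2, 0.3, 1.0, 0.8,
              0.1, 0.2, 0.8, 1.0),
            nrow = 4, byrow = TRUE,
            dimnames = list(paste0("doc", 1:4), paste0("doc", 1:4)))

# One simple way to turn similarity into dissimilarity (an assumption;
# another transform may suit your measure better).
m <- 1 - s

# as.dist() keeps only the lower triangle, so m should be symmetric.
stopifnot(isSymmetric(m))
d <- as.dist(m)
```

The row and column names carry over to the dist object, so they show up as leaf labels in later plots.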

You can then perform hierarchical clustering using hclust() like so:

hclust(d)
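To compare against known clusters, the tree can be cut into k groups with cutree(). For a partitional method that accepts a precomputed dissimilarity, one option is pam() (k-medoids) from the cluster package. The sketch below assumes a toy 4-document distance matrix, k = 2, and average linkage; all of those are illustrative choices rather than anything stated in the question:

```r
library(cluster)  # provides pam(); a recommended package in standard R installs

# Hypothetical distance matrix for 4 documents (illustrative values only).
m <- matrix(c(0.0, 0.1, 0.8, 0.9,
              0.1, 0.0, 0.7, 0.8,
              0.8, 0.7, 0.0, 0.2,
              0.9, 0.8, 0.2, 0.0),
            nrow = 4, byrow = TRUE)
d <- as.dist(m)
k <- 2  # assumed number of clusters

# Hierarchical: build the tree, then cut it into k groups.
hc <- hclust(d, method = "average")
hier_labels <- cutree(hc, k = k)

# Partitional: k-medoids works directly on a dist object.
pm <- pam(d, k = k, diss = TRUE)
part_labels <- pm$clustering
```

Both hier_labels and part_labels are integer vectors with one cluster id per document, so they can be tabulated against the true labels with table().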

Upvotes: 1
