Hierarchical Cluster Analysis in Cluster 3.0

Question

I'm new to this site as well as new to cluster analysis, so I apologize if I violate conventions.

I've been using Cluster 3.0 to perform hierarchical cluster analysis with Euclidean distance and average linkage. Cluster 3.0 outputs a .gtr file in which each line gives a node, the two items it joins, and their similarity score. I've noticed that the first line in the .gtr file always joins one gene with another gene, followed by the similarity score. How do I reproduce this similarity score?

In my data set, I have 8 genes. I create a distance matrix where d_{ij} contains the Euclidean distance between gene i and gene j, then normalize the matrix by dividing each element by the maximum value in the matrix. To get the similarity matrix, I subtract each element from 1. However, this calculation never uses the linkage type, and my result differs from the similarity score in the output.
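For concreteness, the calculation described above can be sketched in Python roughly as follows. This is only a minimal sketch of my own procedure, not of what Cluster 3.0 does internally; the function name `similarity_from_distances` and the toy data (3 genes instead of my 8) are made up for illustration.

```python
import numpy as np

def similarity_from_distances(data):
    """Return (distance, similarity) matrices for a genes x conditions array.

    Assumption: plain Euclidean distance, scaled by the largest entry,
    then subtracted from 1. Cluster 3.0 may define these differently.
    """
    data = np.asarray(data, dtype=float)
    # Pairwise differences between every pair of gene rows
    diff = data[:, None, :] - data[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Normalize by the maximum distance and convert to a similarity
    sim = 1.0 - dist / dist.max()
    return dist, sim

# Hypothetical toy data: 3 genes measured under 2 conditions
expr = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])
dist, sim = similarity_from_distances(expr)
print(dist)
print(sim)
```

With this toy data the closest pair is at distance 5 and the largest distance is 10, so the normalized similarity of the closest pair comes out as 0.5.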

I am mainly confused about how the linkage affects the similarity of the first node (the joining of the two closest genes) and how to compute the similarity score.
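To make the linkage part of the question concrete, here is a small sketch of how I understand the textbook definitions of single, complete, and average linkage. The helper `linkage_distance` and the toy distance matrix are hypothetical, and this is not taken from Cluster 3.0's code. If these definitions apply, then when both clusters contain a single gene all three linkages should reduce to the plain pairwise distance, so the linkage type would only matter from the second merge onward.

```python
import numpy as np

def linkage_distance(dist, cluster_a, cluster_b, method="average"):
    """Distance between two clusters, given a full pairwise distance matrix.

    cluster_a and cluster_b are lists of row indices into dist.
    """
    # All pairwise distances between members of the two clusters
    block = dist[np.ix_(cluster_a, cluster_b)]
    if method == "single":
        return float(block.min())
    if method == "complete":
        return float(block.max())
    if method == "average":
        return float(block.mean())
    raise ValueError(f"unknown linkage method: {method}")

# Hypothetical pairwise distances between 3 genes
toy_dist = np.array([[0.0, 5.0, 10.0],
                     [5.0, 0.0, 5.0],
                     [10.0, 5.0, 0.0]])

# First merge: two single genes, so every linkage gives the same value
first = {m: linkage_distance(toy_dist, [0], [1], m)
         for m in ("single", "complete", "average")}

# Later merge: cluster {0, 1} against gene 2, where the linkages differ
later = {m: linkage_distance(toy_dist, [0, 1], [2], m)
         for m in ("single", "complete", "average")}
print(first)
print(later)
```

So under these definitions any mismatch on the first line of the .gtr file would have to come from how the distance or the similarity is defined, not from the linkage.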

Thank you!
