TheBumpper

Reputation: 55

Clustering genes based on function

We would like to use either hierarchical or k-means clustering to group the genes in our dataset by function. We have the GO ID for each gene, and we would now like to cluster the genes by function, preferably hierarchically: from the bottom level (where each function is unique) up to higher levels (where functions are merged into more general groups). We are programming in R.

Thanks in advance for your help!

Upvotes: 0

Views: 1031

Answers (2)

Has QUIT--Anony-Mousse

Reputation: 77454

k-means isn't a good idea for this kind of data.

Instead, look at algorithms specialized for this data, in particular biclustering algorithms.
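
For comparison, here is a minimal sketch of a simple baseline that avoids k-means. It is not the biclustering approach suggested above. It runs plain hierarchical clustering on the overlap between GO annotation sets, using only base R. The gene names and GO assignments are made-up toy data, and the sketch ignores the structure of the GO hierarchy itself.

```r
# Hypothetical toy data: each gene maps to a set of GO term IDs.
gene2go <- list(
  geneA = c("GO:0006412", "GO:0006414"),
  geneB = c("GO:0006412", "GO:0006414", "GO:0006457"),
  geneC = c("GO:0006281", "GO:0006974"),
  geneD = c("GO:0006281", "GO:0006974", "GO:0006302")
)

# Build a binary gene-by-term matrix (1 = gene is annotated with the term).
terms <- sort(unique(unlist(gene2go)))
m <- t(sapply(gene2go, function(x) as.integer(terms %in% x)))
colnames(m) <- terms

# dist(method = "binary") gives the Jaccard distance between annotation sets.
d <- dist(m, method = "binary")

# Agglomerative clustering: starts with single genes and merges upwards.
hc <- hclust(d, method = "average")

# Cut the tree into two groups (k = 2 is an arbitrary choice for this toy data).
groups <- cutree(hc, k = 2)
print(groups)
```

Calling plot(hc) draws the dendrogram, which gives the bottom-up view asked for in the question; cutting it at different heights yields coarser or finer groups.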

Upvotes: 1

micans

Reputation: 1106

Usually one either performs a differential expression analysis between two conditions, or clusters genes based on expression across conditions or time points. After that, it is possible to look for overrepresentation of GO terms in the differentially expressed gene sets or in the clusters.
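
As a rough illustration of the overrepresentation step, the sketch below tests a single GO term in a single cluster with a one-sided hypergeometric test in base R. All counts are hypothetical, and a real analysis would repeat this for many terms and correct for multiple testing.

```r
# Hypothetical counts for one GO term and one cluster.
N <- 10000  # genes in the background set
K <- 200    # background genes annotated with the GO term
n <- 150    # genes in the cluster
k <- 12     # cluster genes annotated with the GO term

# P(X >= k) when drawing n genes at random from the background.
p_value <- phyper(k - 1, K, N - K, n, lower.tail = FALSE)

# The same test expressed as a one-sided Fisher's exact test on a 2x2 table.
tab <- matrix(c(k, K - k, n - k, N - K - n + k), nrow = 2)
p_fisher <- fisher.test(tab, alternative = "greater")$p.value

print(p_value)
```

With several terms, the resulting p-values can be passed through p.adjust (for example with method = "BH") before deciding which terms count as enriched.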

You may be interested in GeneMania (http://www.genemania.org/) - you can enter a list of genes, which will be presented as a network (with lots of options for customisation and expansion). This tool will also report the GO terms that are enriched in the network. A second tool of interest is Gorilla (http://cbl-gorilla.cs.technion.ac.il/) - this shows the GO hierarchy itself, with GO terms lighting up if they are enriched.

Upvotes: 1
