Haohan Wang

Reputation: 607

similarity of genes given gene name, in BioPython

How can I find the similarity of two genes, given their gene names? By similarity, I mean the similarity of their sequences. I am new to this area and was given this task by my professor, so I do not know the different types of similarity.

Can this be done with Biopython?

Thank you so much.

Update in response to the answers:
Thanks, but I have tried that.
My main problem is that when I retrieve gene sequences from the database, some results come back as nucleotide sequences and others come back as protein sequences. If I want to compare them, I need to make sure they are all nucleotide sequences or all protein sequences, right?

Here is the code I use:

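 from Bio import Entrez  # provides Entrez.efetch
 handle = Entrez.efetch(db="nucleotide", id=t, rettype="gb", retmode="text")  # t holds the record ID
 record = handle.read()  # raw GenBank text for that ID
 handle.close()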

Then, for some IDs I get a sequence like agtc, and for others I get a sequence like mwvllvffll tltylfwpkt. Those are proteins, right?

I am stuck here and do not know what to do next.

Upvotes: 0

Views: 762

Answers (2)

mehmet

Reputation: 110

If you're really into this, you should learn what E-values, scores, and so on mean. High scores and low E-values correspond to better similarity.

You must compare sequences of the same type, but if you want to compare nucleotides to proteins anyway, first translate the DNA to protein.
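The translation step can be sketched as follows. This is a minimal example, assuming the nucleotide sequence is a coding sequence that starts in frame. `Seq.translate` is Biopython's own method; the `looks_like_nucleotide` and `to_protein` helpers and the example sequence are made up for illustration.

```python
from Bio.Seq import Seq


def looks_like_nucleotide(sequence):
    """Heuristic (an assumption, not a Biopython function):
    treat the input as DNA/RNA if it only contains A, C, G, T, U or N."""
    return set(str(sequence).upper()) <= set("ACGTUN")


def to_protein(sequence):
    """Return a protein Seq, translating first if the input looks like DNA/RNA."""
    text = str(sequence).upper()
    if looks_like_nucleotide(text):
        # to_stop=True ends translation at the first in-frame stop codon
        # and leaves the stop symbol out of the result.
        return Seq(text).translate(to_stop=True)
    return Seq(text)


coding_dna = "ATGTGGGTGCTGCTGGTGTTCTTCCTGCTGTAA"  # made-up coding sequence
protein = to_protein(coding_dna)
print(protein)
```

The letter-based check is only a rough guess: a very short peptide that happens to contain nothing but A, C, G, T or N would be misread as DNA, so checking the record type in the database is more reliable when it is available.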

Take a look at the NCBI, Ensembl, and EBI websites. They provide almost all the tools you need.

If you have lots of sequences to compare, it would be wise to use Biopython, but first work through the cookbook, as MattDMo said. Look around the internet to see how other programmers did it, and try to understand their code.

Good luck

Upvotes: 0

MattDMo

Reputation: 102902

You should start off by reading through the Biopython Tutorial, which covers all of the basics. Your problem is pretty straightforward (assuming you already know how to program in Python): read in the gene name or accession ID, retrieve the sequences, align the sequences, then generate summary information (percent identity, percent similarity, gap score, etc.). All of these functions are covered in the tutorial and the cookbook. The Biopython API documentation is also very helpful when working with the individual classes and methods.
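The align-and-summarize steps might look roughly like this, assuming both sequences have already been retrieved and are of the same type. `Bio.Align.PairwiseAligner` is Biopython's pairwise aligner; the `percent_identity` helper, the scoring values, and the example sequences are illustrative assumptions, not recommended settings.

```python
from Bio import Align


def percent_identity(seq_a, seq_b):
    """Globally align two sequences and return the percentage of
    alignment columns in which both sequences carry the same letter."""
    aligner = Align.PairwiseAligner()
    aligner.mode = "global"
    # Arbitrary example scores; real work would use values (or a
    # substitution matrix) suited to the sequence type.
    aligner.match_score = 1
    aligner.mismatch_score = -1
    aligner.open_gap_score = -2
    aligner.extend_gap_score = -0.5

    alignment = aligner.align(seq_a, seq_b)[0]  # first best-scoring alignment

    matches = 0
    aligned_columns = 0
    # alignment.aligned lists the gap-free blocks as index ranges
    # into seq_a and seq_b respectively.
    for (a_start, a_end), (b_start, b_end) in zip(*alignment.aligned):
        block_a = seq_a[a_start:a_end]
        block_b = seq_b[b_start:b_end]
        aligned_columns += len(block_a)
        matches += sum(x == y for x, y in zip(block_a, block_b))

    # In a global alignment every unaligned letter sits opposite a gap,
    # so the total number of columns is len(a) + len(b) - aligned columns.
    total_columns = len(seq_a) + len(seq_b) - aligned_columns
    return 100.0 * matches / total_columns


# Made-up sequences that differ at one position.
identity = percent_identity("ACGTACGT", "ACGTTCGT")
print(f"{identity:.1f}% identity")
```

Counting gap columns in the denominator is one of several common conventions for percent identity, so it is worth stating which one you used when reporting results.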

Good luck!

Upvotes: 1
