user3224522
user3224522

Reputation: 1151

Translating a FASTA file of CDS to proteins taking into account open reading frames

I have a FASTA file with nucleotide sequences. I would need to translate them to proteins, but taking into account 3 reading frames (i.e for +1 'ATG',+2 'TG',+3 'G'). This simple code using BioPython makes a perfect job if reading frame is +1. But for the remaining two it gives different translation. Is there a way to specify in BioPython the reading frames?

Input file

>contig20
TGGATCGGCGAGACCGACTCCGAGCGCGCCGACGTCGCCAAGGGATGGGCGTCCCTCCAGGTAAACCAACCCT
CTTCCCATCAAATTCTTTTTACCATGCAATATAGTCGTCGGTGTCGATCACTGTCATGCATATGGATTGGATT
AAACATGTCGCGGTCTCGTCGTTGCACGTTTCTTTCTTGCTTAACCACCTACCAATAGCAGCTGGTTGTAGCT
AGGTCGCTGCTGGGGATTGAAATCTTCAGCTTTAAGATGACAGCGACGACGCCATGGTCGGTCGCCCGGTCGT
GATCACCTACTCCAATTTACTGGAAAAATGATGATTTGTAAACGTGCATGCATGTTCCTTCAACCTTTTGTTA

Desired Output file

>contig20 Translated - Frame 3
DRRDRLRARRRRQGMGVPPGKPTLFPSNSFYHAI*SSVSITVMHMDWIKHVAVSSLHVSFLLNHLPIAAG
CS*VAAGD*NLQL*DDSDDAMVGRPVVITYSNLLEK**FVNVHACSFNLLL

Script

from Bio.SeqRecord import SeqRecord
def make_protein_record(nuc_record):
"""Returns a new SeqRecord with the translated sequence (default table)."""
    return SeqRecord(seq = nuc_record.seq.translate(), \
                 id = "trans_" + nuc_record.id, \
                 description = "translation of CDS, using default table")

from Bio import SeqIO
proteins = (make_protein_record(nuc_rec) for nuc_rec in \
        SeqIO.parse("file.fasta", "fasta"))
SeqIO.write(proteins, "translations.fasta", "fasta")

Upvotes: 1

Views: 813

Answers (1)

wflynny
wflynny

Reputation: 18521

Can't you simply do

from Bio.Seq import translate

contig = 'TGGATCGGCGAGACCGACTCCGAGCGCGCCGACGTCGCCAAGGGATGGGCGTCCCTCCAGGTAAACCAACCCT'
print 'ORF 1'
print translate(contig)
print 'ORF 2'
print translate(contig[1:])
print 'ORF 3'
print translate(contig[2:])

which would yield

'ORF 1'
'WIGETDSERADVAKGWASLQVNQP'
'ORF 2'
'GSARPTPSAPTSPRDGRPSR*TNP'
'ORF 3'
'DRRDRLRARRRRQGMGVPPGKPT'

Upvotes: 1

Related Questions