charliecharlie

Reputation: 1

Reading a file in FASTA format and writing it to another file in GenBank format

I'm trying to read a file that contains a genome sequence using the Seq and SeqIO objects in Biopython. I'm not allowed to use open(). The program should accept a command-line argument giving the name of the FASTA file that contains the input genome.

My script creates the output file, but the file is empty. What am I missing?

This is what I have:

    from Bio.Seq import Seq
    from Bio import SeqIO
    from Bio.SeqRecord import SeqRecord
    from Bio.Alphabet import IUPAC

    recordlist = []

    for SeqRecord in SeqIO.parse('bacterium_genome.fna', 'fasta'):
        myseq = SeqRecord.seq
        myseq.alphabet = IUPAC.unambiguous_dna
        recordlist.append(SeqRecord)


    SeqIO.write(recordlist, 'bacterium_genome.gb', 'gb')

Upvotes: 0

Views: 793

Answers (1)

Chris_Rands

Reputation: 41168

What you're doing should actually work, assuming the input FASTA file is valid and non-empty, but it is not that elegant: the Seq import is never used, the SeqRecord import is shadowed by the loop variable, and the intermediate list is not needed. You could instead set the alphabet directly on each record and write that record to the output file handle on every iteration:

    from Bio import SeqIO
    from Bio.Alphabet import IUPAC

    with open('bacterium_genome.gb', 'w') as out_f:
        for record in SeqIO.parse('bacterium_genome.fna', 'fasta'):
            record.seq.alphabet = IUPAC.unambiguous_dna
            SeqIO.write(record, out_f, 'genbank')
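
For anyone on a recent Biopython release: Bio.Alphabet was removed in Biopython 1.78, and the GenBank writer now expects a molecule_type annotation on each record instead of an alphabet. Below is a minimal sketch under that assumption. It also covers the two constraints from the question, since SeqIO.parse and SeqIO.write both accept filenames (so open() is not needed) and the input name is taken from the command line. The names fasta_to_genbank and main are hypothetical, and deriving the output name by swapping the extension for .gb is my own assumption, not something the question specifies.

```python
import sys
from pathlib import Path

from Bio import SeqIO


def fasta_to_genbank(fasta_path, genbank_path, molecule_type="DNA"):
    """Convert FASTA to GenBank; return the number of records written."""

    def annotated_records():
        # SeqIO.parse accepts a filename, so no explicit open() is required.
        for record in SeqIO.parse(str(fasta_path), "fasta"):
            # Biopython 1.78+ needs this annotation for GenBank output;
            # it takes the place of the old record.seq.alphabet assignment.
            record.annotations["molecule_type"] = molecule_type
            yield record

    # SeqIO.write also accepts a filename and returns the record count.
    return SeqIO.write(annotated_records(), str(genbank_path), "genbank")


def main(argv):
    """Hypothetical entry point: argv[1] is the input FASTA filename."""
    if len(argv) != 2:
        print("usage: python script.py INPUT_FASTA", file=sys.stderr)
        return 2
    fasta_path = Path(argv[1])
    # Assumption: write next to the input, with the extension changed to .gb.
    genbank_path = fasta_path.with_suffix(".gb")
    count = fasta_to_genbank(fasta_path, genbank_path)
    if count == 0:
        # An empty output file usually means the parser found no records.
        print(f"no FASTA records found in {fasta_path}", file=sys.stderr)
        return 1
    print(f"wrote {count} record(s) to {genbank_path}")
    return 0


# In a script, finish with: sys.exit(main(sys.argv))
```

Checking the count returned by SeqIO.write is a quick way to diagnose the empty-output symptom: a count of 0 means no records were parsed, which points at the input file (wrong path, wrong format, or empty) rather than at the writing step.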

Upvotes: 2
