Lc_decg

Reputation: 189

Error 'FastaIterator' object has no attribute 'records' in Biopython 1.85

Today, when I ran the following code, I suddenly got the error 'FastaIterator' object has no attribute 'records' in Biopython, and the script would not run. I have never had this error before, so I'm confused.

from Bio import __version__

print('\n\nBiopython Version : ', __version__, '\n\n')

from Bio import SeqIO


seq = SeqIO.parse(concensus_path, "fasta")

for record in seq.records:
    SeqIO.write(record, folder + '/' + record.name.split('(')[0].replace('_0_', '_') + '.fasta', "fasta")

This is the first part of a long script. It splits a FASTA file containing multiple DNA sequences into separate FASTA files, each containing a single DNA sequence.

Is there any way to deal with this problem? The input FASTA file itself is fine: I also tried a file that worked before, and it gave the same error.

Upvotes: 0

Views: 66

Answers (1)

ronkosova

Reputation: 26

According to the docs here, you can access the records by just iterating over the returned iterator:

from Bio import __version__

print('\n\nBiopython Version : ', __version__, '\n\n')

from Bio import SeqIO

for record in SeqIO.parse("example.fasta", "fasta"):
    print(record.id)
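
Applied to the splitting loop in the question, the same pattern could look like the sketch below. It iterates over the object returned by SeqIO.parse directly instead of going through .records. The helper names output_name and split_fasta are made up for this illustration, and the file-name rule is copied from the question's code.

```python
import os


def output_name(record_name):
    """Build the output file name with the same rule as the question's code."""
    return record_name.split('(')[0].replace('_0_', '_') + '.fasta'


def split_fasta(fasta_path, folder):
    """Write every record of a multi-record FASTA file to its own file."""
    # Imported here so output_name() stays usable without Biopython installed.
    from Bio import SeqIO

    # Iterate over the parser itself; no .records attribute is needed.
    for record in SeqIO.parse(fasta_path, "fasta"):
        out_path = os.path.join(folder, output_name(record.name))
        SeqIO.write(record, out_path, "fasta")
```

Calling split_fasta(concensus_path, folder) with the question's variables would stand in for the original loop; the output folder is assumed to exist already.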

From version 1.84 to 1.85, the object returned by SeqIO.parse(...) (<class 'Bio.SeqIO.FastaIO.FastaIterator'>) lost its records attribute, which I think was just unpacking the iterator in memory***.

Try installing Biopython 1.84 with pip install -v biopython==1.84

Then, for an input like:

fasta_test.fasta:

>DNA_sequence_1
GCAAAAGAACCGCCGCCACTGGTCGTGAAAGTGGTCGATCCAGTGACATCCCAGGTGTTGTTAAATTGAT
CATGGGCAGTGGCGGTGTAGGCTTGAGTACTGGCTACAACAACACTCGCACTACCCGGAGTGATAGTAAT
GCCGGTGGCGGTACCATGTACGGTGGTGAAGT

>DNA_sequence_2
TCCCAGCCAGCAGGTAGGGTCAAAACATGCAAGCCGGTGGCGATTCCGCCGACAGCATTCTCTGTAATTA
ATTGCTACCAGCGCGATTGGCGCCGCGACCAGGATCCTTTTTAACCATTTCAGAAAACCATTTGAGTCCA
TTTGAACCTCCATCTTTGTTC


>DNA_sequence_3
AACAAAAGAATTAGAGATATTTAACTCCACATTATTAAACTTGTCAATAACTATTTTTAACTTACCAGAA
AATTTCAGAATCGTTGCGAAAAATCTTGGGTATATTCAACACTGCCTGTATAACGAAACACAATAGTACT
TTAGGCTAACTAAGAAAAAACTTT

try running:

from Bio import __version__

print('\n\nBiopython Version : ', __version__, '\n\n')

from Bio import SeqIO

import sys

concensus_path ='fasta_test.fasta'

seq = SeqIO.parse(concensus_path, "fasta")

print('\n\ntype(seq) : ', type(seq), '\n')

print('\n\nseq.records size : ', sys.getsizeof(seq.records),'\n\n')

print('\n\nseq. size : ', sys.getsizeof(seq),'\n\n')

and tell us if you see any difference

ADDENDUM:

***I was wrong: seq.records returns a generator!

Try adding more records to the fasta_test.fasta file and compare the previous object size with:

seq = SeqIO.parse(concensus_path, "fasta")
recs = [i for i in seq.records]
# print(recs)
    
print('all records  size  : ' , sys.getsizeof(recs))
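
Why a generator and a list report such different sizes can be seen without Biopython at all. In the sketch below, fake_records is a hypothetical stand-in for seq.records, not Biopython code: the generator object stays the same size however many items it will yield, while the list built from it grows.

```python
import sys


def fake_records(n):
    """Hypothetical stand-in for seq.records: lazily yields n record names."""
    for i in range(n):
        yield f"DNA_sequence_{i + 1}"


small_gen = fake_records(3)
large_gen = fake_records(3000)

# A generator only stores its execution state, not the items it will yield.
small_gen_size = sys.getsizeof(small_gen)
large_gen_size = sys.getsizeof(large_gen)
print('generator sizes : ', small_gen_size, large_gen_size)

# Materialising the generators pulls every item into memory.
small_list = list(small_gen)
large_list = list(large_gen)

# A list holds a reference to every item, so its size grows with the count.
print('list sizes      : ', sys.getsizeof(small_list), sys.getsizeof(large_list))
```

Note that sys.getsizeof on a list counts only the list's own reference array, not the objects it points to, so the real memory difference is larger still.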

I think that in 1.84 seq.records is created in class SequenceIterator in biopython/Bio/SeqIO/Interfaces.py:

....
....
try:
    self.records = self.parse(self.stream)
....
....

This happens in the __init__ of class SequenceIterator. FastaIterator inherits it because it is defined as class FastaIterator(SequenceIterator), and FastaIterator is the type of object that SeqIO.parse returns.

In 1.85, class FastaIterator(SequenceIterator) loses its parse method too.

In 1.84 it is at line 189:

def parse(self, handle):
    """Start parsing the file, and return a SeqRecord generator."""
    records = self.iterate(handle)  # iterate is the next method defined in the class
    return records
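
Put together, the 1.84 pattern described above can be modelled with a toy class hierarchy. This is a simplified illustration and not Biopython's source: ToyBaseIterator, ToyOldFasta and ToyNewFasta are invented names, and ToyNewFasta only assumes what the error message shows, namely an iterator that no longer exposes records.

```python
class ToyBaseIterator:
    """Toy model of the 1.84 pattern: __init__ stores parse() as .records."""

    def __init__(self, stream):
        self.stream = stream
        self.records = self.parse(self.stream)

    def __iter__(self):
        return self

    def __next__(self):
        return next(self.records)


class ToyOldFasta(ToyBaseIterator):
    def parse(self, handle):
        """Return a generator of (very simplified) record titles."""
        return (line[1:].strip() for line in handle if line.startswith(">"))


class ToyNewFasta:
    """Assumed shape of an iterator without .records (not the real 1.85 code)."""

    def __init__(self, stream):
        self.stream = stream

    def __iter__(self):
        return self

    def __next__(self):
        # Advance the underlying stream until the next title line.
        for line in self.stream:
            if line.startswith(">"):
                return line[1:].strip()
        raise StopIteration


lines = [">DNA_sequence_1", "GCAAAAG", ">DNA_sequence_2", "TCCCAGC"]
old, new = ToyOldFasta(lines), ToyNewFasta(iter(lines))
print('has records : ', hasattr(old, "records"), hasattr(new, "records"))
print('same output : ', list(old) == list(new))
```

Both toy objects yield the same titles when iterated directly; only code that reaches for the records attribute breaks, which matches the error in the question.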

Upvotes: 1
