Reputation: 1
I want to print the IDs and sequences of multiple .fasta files and also put them in an array, but I'm having trouble accessing the sequences themselves. I played around with SeqIO from Biopython to parse the .fasta files and tried to use os and glob to get at the files in the folder. What am I doing wrong here? I'm really struggling with the code since I don't have much programming experience. I don't get an error, but nothing is printed either. Any advice?
from Bio import SeqIO
import os,glob
folder_path = ('genome_nucseq_unique/data/')
for seq_record in SeqIO.parse(glob.glob(os.path.join(folder_path, '*.fasta')), "fasta"):
    print(seq_record.id)
    print(seq_record.id)
Upvotes: 0
Views: 1445
Reputation: 22827
SeqIO.parse expects a str, bytes or os.PathLike object, not the list that glob.glob() returns. Modify your code like this:
from Bio import SeqIO
import os, glob
folder_path = 'genome_nucseq_unique/data/'
fasta_paths = glob.glob(os.path.join(folder_path, '*.fasta'))
for fasta_path in fasta_paths:
    print(fasta_path)
    for seq_record in SeqIO.parse(fasta_path, "fasta"):
        print(seq_record.id)
        print(seq_record.seq)
        print()
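The question also asks about putting the IDs and sequences in an array, which the loop above only prints. One possible way to do that is sketched below. The function name collect_fasta_records and the choice of (id, sequence) tuples are assumptions made for illustration, not something from the question. The sketch also prints a message when glob.glob() matches no files, since an empty match list is one way to end up with no error and no output.

```python
import glob
import os

from Bio import SeqIO


def collect_fasta_records(folder_path):
    """Return a list of (id, sequence) tuples from every .fasta file in folder_path.

    Hypothetical helper for illustration; the name and return shape are assumptions.
    """
    # sorted() only makes the file order predictable; glob.glob() gives no ordering guarantee
    fasta_paths = sorted(glob.glob(os.path.join(folder_path, "*.fasta")))
    if not fasta_paths:
        # glob.glob() returns an empty list instead of raising when nothing matches
        print(f"No .fasta files found in {folder_path!r}")

    records = []
    for fasta_path in fasta_paths:
        # SeqIO.parse takes one file at a time and yields one SeqRecord per entry
        for seq_record in SeqIO.parse(fasta_path, "fasta"):
            records.append((seq_record.id, str(seq_record.seq)))
    return records


if __name__ == "__main__":
    for record_id, sequence in collect_fasta_records("genome_nucseq_unique/data/"):
        print(record_id, sequence)
```

Converting seq_record.seq with str() stores plain strings in the list; keep the Seq objects instead if you need Biopython methods such as reverse_complement() later.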
Upvotes: 1