bioknup

Reputation: 1

Open and Parse multiple .fasta files from a folder with a for loop and print/extract sequence (python)

I want to print the IDs and sequences from multiple .fasta files and also store them in an array, but I can't get access to the sequences themselves. I used SeqIO from Biopython to parse the .fasta files, and os and glob to find the files in the folder. What am I doing wrong here? I'm struggling with the code because I don't have much programming experience. I don't get an error, but nothing is printed either. Any advice?

from Bio import SeqIO
import os,glob
folder_path = ('genome_nucseq_unique/data/')
for seq_record in SeqIO.parse(glob.glob(os.path.join(folder_path, '*.fasta')), "fasta"):
    print(seq_record.id)
    print(seq_record.id)

Upvotes: 0

Views: 1445

Answers (1)

BioGeek

Reputation: 22827

SeqIO.parse expects a single file, given as a file handle or as a path (a str, bytes or os.PathLike object), not a list like the one glob.glob() returns. Loop over the paths and parse each file separately, like this:

from Bio import SeqIO
import os, glob
folder_path = 'genome_nucseq_unique/data/'
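# Collect the paths of all .fasta files in the folder first.
fasta_paths = glob.glob(os.path.join(folder_path, '*.fasta'))
for fasta_path in fasta_paths:
    print(fasta_path)
    # Parse one file at a time; each file may hold several records.
    for seq_record in SeqIO.parse(fasta_path, "fasta"):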
        print(seq_record.id)
        print(seq_record.seq)
        print()
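
Since the question also asks about putting the IDs and sequences in an array, here is a minimal sketch of one way to do that with the same loop. The function name collect_records, the pattern and parse parameters, and the (file name, id, sequence) tuple layout are all assumptions made for illustration; they are not part of Biopython. By default the function uses SeqIO.parse, so Biopython needs to be installed.

```python
import glob
import os


def collect_records(folder_path, pattern="*.fasta", parse=None):
    """Return a list of (file name, record id, sequence string) tuples.

    collect_records is a hypothetical helper, not a Biopython function.
    """
    if parse is None:
        # Imported here so another parser can be passed in instead;
        # by default the work is done by Biopython's SeqIO.parse.
        from Bio import SeqIO
        parse = SeqIO.parse

    records = []
    # glob.glob returns paths in arbitrary order, so sort for repeatable output.
    for fasta_path in sorted(glob.glob(os.path.join(folder_path, pattern))):
        # Parse one file at a time; a list of paths cannot be parsed directly.
        for seq_record in parse(fasta_path, "fasta"):
            records.append(
                (os.path.basename(fasta_path), seq_record.id, str(seq_record.seq))
            )
    return records
```

Calling collect_records('genome_nucseq_unique/data/') should then give one tuple per sequence, which you can loop over or index like any other list. Converting seq_record.seq with str() stores plain strings; keep the Seq objects instead if you want to use Biopython's sequence methods later.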

Upvotes: 1
