user15708301

Reputation: 43

Extracting sequences from a FASTA file using a list of coordinates

I have a list of sequence start coordinates, and I want to retrieve the sequences at those coordinates from a genome FASTA file. I tried using grep and R, but didn't get the desired output.

list of coordinates

10001276
10001433
10002237
10002342
10002617
10002736
10003584
10003832
10005377
1000567

Which option would be the most efficient?

Upvotes: 0

Views: 868

Answers (1)

YotamW Constantini

Reputation: 410

I suggest you try BioPython:

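from Bio import SeqIO

# SeqIO.read expects exactly one record; pass "fasta" as the format for a FASTA file
record = SeqIO.read("NC_006581.gbk", "genbank")
# Slice with a colon (start:end), not a comma; Python indices are 0-based
print("\nPosition 10001276: ", record.seq[10001276:10001276 + 1])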
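If installing BioPython is not an option, the same lookup can be sketched with the standard library alone. This is only a minimal illustration, not a replacement for SeqIO. It assumes the coordinates are 1-based start positions on a single sequence, and since the question gives only start positions, the window size `length` is a hypothetical parameter. The helper names `read_fasta` and `extract_at` are made up for this example.

```python
def read_fasta(path):
    """Return {record_id: sequence} for every record in a FASTA file."""
    records = {}
    name = None
    chunks = []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # A new header closes the previous record, if any
                if name is not None:
                    records[name] = "".join(chunks)
                header = line[1:].split()
                name = header[0] if header else ""
                chunks = []
            else:
                chunks.append(line)
    if name is not None:
        records[name] = "".join(chunks)
    return records


def extract_at(sequence, starts, length=1, one_based=True):
    """Slice `length` bases at each start coordinate (assumed 1-based by default).

    Coordinates that fall outside the sequence are skipped.
    """
    offset = 1 if one_based else 0
    result = {}
    for start in starts:
        begin = start - offset
        if 0 <= begin < len(sequence):
            result[start] = sequence[begin:begin + length]
    return result
```

To use it, read the coordinate list into integers (for example, one per line from a file such as coordinates.txt, a hypothetical name), pick the chromosome from the dictionary returned by read_fasta, and call extract_at(sequence, starts, length=50). This loads the whole genome into memory, which is fine for a single lookup pass. For repeated queries on large genomes, an indexed approach would likely be faster.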

Upvotes: 0
