Thom Vessies

Reputation: 11

Is there an easy way to download genomes in fasta format from NCBI using Python?

I'm trying to download genomes from NCBI (preferably in fasta format) using Python, but so far nothing has really worked. APIs are new to me and I don't really understand the documentation (https://www.ncbi.nlm.nih.gov/books/NBK25497/).

My eventual goal is downloading all genomes of every species within a genus, but downloading just 1 genome with Python would be a great start.

I'm also open to options other than using an API.

Thanks in advance :)

Edit: This is my code sample

import ncbi_genome_download as ngd

taxon_name = "Rubus"
ngd.download().group(taxon_name)

This downloads genomic data from the archaea group, but not from the group that I'm interested in: Rubus.

Upvotes: 0

Views: 1633

Answers (1)

Vovin

Reputation: 770

Yes, there is an easy way, using Biopython's Entrez module :-)

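from Bio import Entrez

# NCBI asks for a contact address; the API key is optional but raises the rate limit.
Entrez.email = "your.name@example.com"  # placeholder, use your own address
Entrez.api_key = "y0ur_ap1_key"

# Search the nucleotide database for up to 3 records of the organism.
handle = Entrez.esearch(db="nucleotide", retmax=3, term="Procyon lotor", field="Organism")
IDs = Entrez.read(handle)["IdList"]
handle.close()

# Fetch each record as FASTA text and print it.
for ID in IDs:
    print(Entrez.efetch(db="nucleotide", id=ID, rettype="fasta", retmode="text").read())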

Output:

>MK804320.1 Procyon lotor voucher MNHN:TC793 cytochrome b (CYTB) gene, partial cds; mitochondrial
GGGCAACAGTAATTACAAACCTCCTGTCAGCTATCCCCTATATCGGATCTAACCTTGTAGAATGAATTTG
AGGAGGGTTTTCAGTAGACAAAGCCACCCTAACACGATTCTTCGCATTCCACTTCATTCTACCATTTATT
ATCACAGCGCTAGCAATAATTCACCTGCTATTCCTACACGAAACAGGATCCAATAACCCTTCTGGAATTA
CATCAGAATCTGACAAAATTCCATTTCACCCATACTACACCATTAAAGACATTCTGGGAATCCTATTCCT
TATTTTTGTACTTATAGGTTTAGTGCTATTTACGCCAGACCTACTAGGTGACCCAGATAATTACACACCC
GCTAACCCCTTAAACACCCCACCTCACATTAAACCTGAATGATATTTTCTATTCGCCTACGCAATTCTAC
GTTCCATTCCCAACAAACTAGGAGGAGTCCTAGCCCTAGTCCTCTCCATCTTAATCCTAATCATCATTCC
ACTCCTACACACCTCAAAACAACGAAGCATAATATTTCGGCCACTTAGCCAATGTTTATTCTGATTCCTA
GTAGCAGACCTCCTCGTCCTAACATGAATTGGAGGTCAACCAGTAGAATATCCCTTCATCATCATCGGCC
AACTAGCCTCCATCTTCTACTTTATAATCCTCCTGATCCTAATACCAACAATAAATATCATCGAAAATAA
TCTGTTAAAATGAAGA


>MK804319.1 Procyon lotor voucher MNHN:TC792 cytochrome b (CYTB) gene, partial cds; mitochondrial
GGGCAACAGTAATTACAAACCTCCTGTCAGCTATCCCCTATATCGGATCTAACCTTGTAGAATGAATTTG
AGGAGGGTTTTCAGTAGACAAAGCCACCCTAACACGATTCTTCGCATTCCACTTCATTCTACCATTTATT
ATCACAGCGCTAGCAATAATTCACCTGCTATTCCTACACGAAACAGGATCCAATAACCCTTCTGGAATTA
CATCAGAATCTGACAAAATTCCATTTCACCCATACTACACCATTAAAGACATTCTGGGAATCCTATTCCT
TATTTTTGTACTTATAGGTTTAGTGCTATTTACGCCAGACCTACTAGGTGACCCAGATAATTACACACCC
GCTAACCCCTTAAACACCCCACCTCACATTAA


>MK804318.1 Procyon lotor voucher MNHN:TC791 cytochrome b (CYTB) gene, partial cds; mitochondrial
GGGCAACAGTAATTACAAACCTCCTGTCAGCTATCCCCTATATCGGATCTAACCTTGTAGAATGAATTTG
AGGAGGGTTTTCAGTAGACAAAGCCACCCTAACACGATTCTTCGCATTCCACTTCATTCTACCATTTATT
ATCACAGCGCTAGCAATAATTCACCTGCTATTCCTACACGAAACAGGATCCAATAACCCTTCTGGAATTA
CATCAGAATCTGACAAAATTCCATTTCACCCATACTACACCATTAAAGACATTCTGGGAATCCTATTCCT
TATTTTTGTACTTATAGGTTTAGTGCTATTTACGCCAGACCTACTAGGTGACCCAGATAATTACACACCC
GCTAACCCCTTAAACACCCCACCTCACATTAAACCTGAATGATATTTTCTATTCGCCTACGCAATTCTAC
GTTCCATTCCCAACAAACTAGGAGGAGTCCTAGCCCTAGTCCTCTCCATCTTAATCCTAATCATCATTCC
ACTCCTACACACCTCAAAACAACGAAGCATAATATTTCGGCCACTTAGCCAATGTTTATTCTGATTCCTA
GTAGCAGACCTCCTCGTCCTAACATGAATTGGAGGTCAACCAGTAGAATATCCCTTCATCATCATCGGCC
AACTAGCCTCCATCTTCTACTTTATAATCCTCCTGATCCTAATACCAACAATAAATATCATCGAAAATAA
TCTGTTAAAATGAAGA
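
If you would rather not install Biopython, the same two steps (search for IDs, then fetch FASTA) can be sketched with only the standard library against the E-utilities endpoints described in the documentation linked in the question. This is a minimal sketch, not a definitive implementation: the helper names `esearch_url`, `efetch_url` and `save_fasta` are made up for illustration, and it assumes that an `[Organism]` search on a genus name such as Rubus also matches the species below it. Note that the nucleotide database returns individual sequence records (like the cytochrome b entries above), not necessarily whole genome assemblies.

```python
import json
import urllib.parse
import urllib.request

# Base URL of the NCBI E-utilities.
EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(organism, retmax=3, db="nucleotide"):
    """Build an ESearch URL restricted to the Organism field (hypothetical helper)."""
    params = {
        "db": db,
        "term": f"{organism}[Organism]",
        "retmax": retmax,
        "retmode": "json",
    }
    return f"{EUTILS}/esearch.fcgi?{urllib.parse.urlencode(params)}"


def efetch_url(ids, db="nucleotide"):
    """Build an EFetch URL that returns the given IDs as FASTA text (hypothetical helper)."""
    params = {"db": db, "id": ",".join(ids), "rettype": "fasta", "retmode": "text"}
    return f"{EUTILS}/efetch.fcgi?{urllib.parse.urlencode(params)}"


def save_fasta(organism, path, retmax=3):
    """Search for an organism, write the matching records to `path`, return the record count."""
    with urllib.request.urlopen(esearch_url(organism, retmax)) as resp:
        ids = json.load(resp)["esearchresult"]["idlist"]
    if not ids:
        return 0
    # One EFetch request can return several records when the IDs are comma-separated.
    with urllib.request.urlopen(efetch_url(ids)) as resp, open(path, "wb") as out:
        out.write(resp.read())
    return len(ids)
```

Calling `save_fasta("Rubus", "rubus.fasta")` would then write up to three records to `rubus.fasta`. Without an API key NCBI limits you to about three requests per second, so add a pause between calls if you loop over many species.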

Upvotes: 0
