Echchama Nayak

Reputation: 933

Fixed number of results biopython

I am trying to retrieve the search results for a PubMed query via Biopython, using the following code:

from Bio import Entrez
from Bio import Medline

Entrez.email = "[email protected]"
LIM = 3


def search(Term):
    handle = Entrez.esearch(db='pmc', term=Term, retmax=100000000)
    record = Entrez.read(handle)
    idlist = record["IdList"]
    handle = Entrez.efetch(db='pmc', id=idlist, rettype="medline", retmode="text")
    records = Medline.parse(handle)
    return list(records)
mydic = search('(pathological conditions, signs and symptoms[MeSH Terms]) AND (pmc cc license[filter]) ')
print(len(mydic))

No matter how many times I try, the output is 10000. I tried different queries and still get 10000. When I check the result count manually in the browser, it differs from query to query and is not 10000.

What exactly is going wrong, and how can I make sure I get all of the results?

Upvotes: 1

Views: 541

Answers (1)

Peter Cock

Reputation: 1614

You only seem to be changing the esearch limit while leaving efetch alone (and the NCBI seems to default to a limit of 10000 there). You need to pass the retstart and retmax arguments to efetch as well, and download the results in batches.

See the "Searching for and downloading abstracts using the history" example in the Biopython Tutorial, http://biopython.org/DIST/docs/tutorial/Tutorial.html or http://biopython.org/DIST/docs/tutorial/Tutorial.pdf
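As a rough illustration of that batching approach, here is a minimal sketch adapted from the pattern in the tutorial example. The names `batch_starts` and `search_all` and the batch size of 500 are my own choices for illustration, not part of Biopython. It assumes `Entrez.email` has already been set as in the question, and it keeps the question's `db='pmc'` and `rettype="medline"` unchanged. I have not verified that this retrieves every record for this particular database and query.

```python
def batch_starts(count, batch_size):
    """Return the retstart offsets needed to cover `count` records."""
    return list(range(0, count, batch_size))


def search_all(term, db="pmc", batch_size=500):
    """Fetch all records for `term` in batches via the Entrez history server.

    Hypothetical helper: assumes Entrez.email was set by the caller.
    """
    # Imported here so batch_starts() can be used without Biopython installed.
    from Bio import Entrez, Medline

    # usehistory="y" asks NCBI to keep the result set on its server,
    # so it can be paged through with WebEnv / QueryKey.
    handle = Entrez.esearch(db=db, term=term, usehistory="y")
    result = Entrez.read(handle)
    handle.close()

    count = int(result["Count"])  # total number of hits for the query
    records = []
    for start in batch_starts(count, batch_size):
        handle = Entrez.efetch(
            db=db,
            rettype="medline",
            retmode="text",
            retstart=start,      # offset of the first record in this batch
            retmax=batch_size,   # number of records in this batch
            webenv=result["WebEnv"],
            query_key=result["QueryKey"],
        )
        records.extend(Medline.parse(handle))
        handle.close()
    return records
```

Calling `search_all(...)` with the query from the question would then replace the original `search(...)` call. Comparing `len(records)` with the `Count` reported by esearch is a quick way to check whether anything was left out.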

Upvotes: 2
