El David

Reputation: 395

KeyError when increasing retmax with Biopython Entrez

I am trying to gather a list of PubMed articles using Biopython's Entrez module, and I want to extract certain fields from each article in MEDLINE format. The code I have written below works when retmax is not set, in which case it defaults to 20 articles. However, I want to gather a much larger number of articles, and if I set retmax to a higher number I receive the error below.

#!/usr/bin/env python
from Bio import Entrez, Medline

Entrez.email = "[email protected]"    
handle = Entrez.esearch(db="pubmed",
                        term="stanford[Affiliation]", retmax=1000)
record = Entrez.read(handle)
pmid_list = record["IdList"]

more_data = Entrez.efetch(db="pubmed", id=",".join(pmid_list), rettype="medline", retmode="text")
all_records = Medline.parse(more_data)

record_list = []
for record in all_records:
    record_dict = {'ID': record['PMID'],
                    'Title': record['TI'],
                    'Publication Date': record['DP'],
                    'Author': record['AU'],
                    'Institute': record['AD']}
    record_list.append(record_dict)

I then receive this error:

Traceback (most recent call last):
  File "./pubmed_pull.py", line 42, in <module> 
    'Institute': record['AD']}
KeyError: 'AD'

I am unsure why I get this error only when I increase the number of articles.

Upvotes: 1

Views: 260

Answers (1)

Kyrubas

Reputation: 897

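Instead of grabbing a key using dict[key], use dict.get(key). This returns None if the key doesn't exist, rather than raising a KeyError. Presumably not every record has an 'AD' field, so the more articles you fetch, the more likely you are to hit one without it.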

for record in all_records:
    # .get() returns None for any field missing from this record
    record_dict = {'ID': record.get('PMID'),
                    'Title': record.get('TI'),
                    'Publication Date': record.get('DP'),
                    'Author': record.get('AU'),
                    'Institute': record.get('AD')}
    record_list.append(record_dict)
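
To see the behaviour without hitting the network, here is a minimal sketch that uses plain dicts as stand-ins for the parsed records (the records yielded by Medline.parse behave like dictionaries). The sample data, the FIELDS mapping, and the extract_fields helper are all made up for illustration and are not part of Biopython.

```python
# Hypothetical stand-ins for parsed Medline records; real ones would
# come from Medline.parse(). The second record deliberately lacks 'AD'.
sample_records = [
    {"PMID": "1", "TI": "First title", "DP": "2015",
     "AU": ["Doe J"], "AD": ["Stanford University"]},
    {"PMID": "2", "TI": "Second title", "DP": "2016", "AU": ["Roe R"]},
]

# Output column name -> Medline field tag (same fields as in the question).
FIELDS = {
    "ID": "PMID",
    "Title": "TI",
    "Publication Date": "DP",
    "Author": "AU",
    "Institute": "AD",
}


def extract_fields(record, default=None):
    """Pick the selected fields, using `default` for any tag the record lacks."""
    return {name: record.get(tag, default) for name, tag in FIELDS.items()}


record_list = [extract_fields(r) for r in sample_records]

# IDs of the records that had no affiliation field.
missing_ad = [r["ID"] for r in record_list if r["Institute"] is None]
print(missing_ad)
```

Passing a second argument to .get(), for example record.get('AD', 'N/A'), swaps None for a placeholder of your choice if that is easier to handle downstream.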


Upvotes: 1
