Reputation: 451
Quick question: this is my first time using Biopython, and I'm trying to jury-rig something quickly based on the tutorials.
I can't get Entrez.efetch() to return MeSH terms for a given article, and the only method appears to be what I'm already doing, namely:
handle = Entrez.efetch(db="pubmed", id=pmids, rettype="medline", retmode="xml")
records = Entrez.read(handle)
where pmids is a list of PubMed IDs.
This returns the following: http://pastie.org/5459700
I've tried tweaking the rettype and retmode parameters per http://www.ncbi.nlm.nih.gov/books/NBK25499/ with no luck. Is there anything obvious I'm missing?
Upvotes: 2
Views: 2164
Reputation: 16521
This works for me:
from Bio import Entrez  # install with 'pip install biopython'
from Bio.Entrez import efetch, read

Entrez.email = "[email protected]"  # register your email

def get_mesh(pmid):
    # call PubMed API
    handle = efetch(db='pubmed', id=str(pmid), retmode='xml')
    # some newer Biopython releases may return a dict here instead of a list;
    # if so, use read(handle)['PubmedArticle'][0]
    xml_data = read(handle)[0]
    # skip articles without MeSH terms
    if 'MeshHeadingList' in xml_data['MedlineCitation']:
        for mesh in xml_data['MedlineCitation']['MeshHeadingList']:
            # grab the qualifier major/minor flag, if any
            major = 'N'
            qualifiers = mesh['QualifierName']
            if len(qualifiers) > 0:
                # read the flag by name rather than by position
                major = str(qualifiers[0].attributes['MajorTopicYN'])
            # grab descriptor name
            descr = mesh['DescriptorName']
            name = descr.title()
            yield (name, major)

# example output
for name, major in get_mesh(128):
    print('{}, {}'.format(name, major))
Upvotes: 7
Reputation: 15315
This question is best asked on the Biopython mailing list or perhaps on Biostars (http://www.biostars.org/). You are much more likely to find people with Entrez experience there.
The problem is that the record with PMID 23165874 doesn't have any MeSH terms. Compare the raw XML of that record to one which has MeSH terms. The latter has a section starting:
<MeshHeadingList>
  <MeshHeading>
    <DescriptorName MajorTopicYN="N">ADP Ribose Transferases</DescriptorName>
    <QualifierName MajorTopicYN="Y">genetics</QualifierName>
  </MeshHeading>
  <MeshHeading>
    <DescriptorName MajorTopicYN="N">Acinetobacter</DescriptorName>
    <QualifierName MajorTopicYN="Y">drug effects</QualifierName>
    <QualifierName MajorTopicYN="Y">genetics</QualifierName>
  </MeshHeading>
..
In other words, it's hard to get something which doesn't exist.
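To make the comparison concrete, here is a minimal sketch that parses PubMed-style XML with the standard library and reports which records carry a MeshHeadingList. The sample XML is an abbreviated, hypothetical stand-in for real efetch output: the first record reuses the headings quoted above under a placeholder PMID of 1, and the second mimics a record with no MeSH section. The function name mesh_terms_by_pmid is made up for this illustration.

```python
import xml.etree.ElementTree as ET

# Abbreviated, hypothetical efetch-style XML. The first record reuses the
# headings quoted above (its PMID "1" is a placeholder); the second has no
# MeshHeadingList at all, like the record for PMID 23165874.
SAMPLE_XML = """\
<PubmedArticleSet>
  <PubmedArticle>
    <MedlineCitation>
      <PMID>1</PMID>
      <MeshHeadingList>
        <MeshHeading>
          <DescriptorName MajorTopicYN="N">ADP Ribose Transferases</DescriptorName>
          <QualifierName MajorTopicYN="Y">genetics</QualifierName>
        </MeshHeading>
        <MeshHeading>
          <DescriptorName MajorTopicYN="N">Acinetobacter</DescriptorName>
          <QualifierName MajorTopicYN="Y">drug effects</QualifierName>
          <QualifierName MajorTopicYN="Y">genetics</QualifierName>
        </MeshHeading>
      </MeshHeadingList>
    </MedlineCitation>
  </PubmedArticle>
  <PubmedArticle>
    <MedlineCitation>
      <PMID>23165874</PMID>
    </MedlineCitation>
  </PubmedArticle>
</PubmedArticleSet>
"""


def mesh_terms_by_pmid(xml_text):
    """Map each PMID to a list of (descriptor, [qualifiers]) tuples."""
    result = {}
    root = ET.fromstring(xml_text)
    for citation in root.iter("MedlineCitation"):
        pmid = citation.findtext("PMID")
        headings = []
        # findall returns an empty list when MeshHeadingList is absent,
        # so records without MeSH terms simply map to []
        for heading in citation.findall("MeshHeadingList/MeshHeading"):
            descriptor = heading.findtext("DescriptorName")
            qualifiers = [q.text for q in heading.findall("QualifierName")]
            headings.append((descriptor, qualifiers))
        result[pmid] = headings
    return result


if __name__ == "__main__":
    for pmid, headings in mesh_terms_by_pmid(SAMPLE_XML).items():
        print(pmid, headings if headings else "no MeSH terms")
```

Returning an empty list for records without a MeshHeadingList keeps a batch of PMIDs from failing on the first unindexed article. Whether this runs unchanged on the full XML that efetch returns is untested here.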
Upvotes: 2