Reputation: 631
I am using urllib and ElementTree to parse XML responses from the PubMed API.
An example of this is:
# Import the modules needed to send requests to URLs and parse XML
# Python 3.4, using IEP (Interactive Editor for Python) as the IDE
import urllib.request
import xml.etree.ElementTree as ET
# Fetch the API response and parse it into an Element object named root
id_request = urllib.request.urlopen('http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esummary.fcgi?db=pubmed&id=1757056')
id_pubmed = id_request.read()
root = ET.fromstring(id_pubmed)
I have now been able to load the data into the object root with ET.fromstring. My problem is that I cannot find the elements I am interested in within this object.
I am referring to the documentation at https://docs.python.org/2/library/xml.etree.elementtree.html and my XML looks like this: http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esummary.fcgi?db=pubmed&id=1757056
I have tried:
#Parse Attempts. Nothing returned.
for author in root.iter('Author'):
    print(author.attrib)
As well as
#No Return for author
for author in root.findall('Id'):
    author = author.find('author').text
    print(author)
Upvotes: 0
Views: 379
Reputation: 1576
Try iterating by tag and checking the Name attribute:
for author in root.iter('Item'):
    if author.attrib.get('Name') == 'Author':
        print("Success")
Or:
author_list = [x for x in root.iter('Item') if x.attrib.get('Name') == 'Author']
I don't know whether you can iterate by attribute directly.
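For what it's worth, ElementTree's limited XPath support does accept an attribute predicate such as [@Name='Author'] in findall. Below is a minimal sketch, assuming the response nests Item elements that carry a Name attribute, as the loop above implies. SAMPLE_XML and the author names in it are invented stand-ins so the example runs without a network call:

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the eSummary response; the names are invented.
SAMPLE_XML = """
<eSummaryResult>
  <DocSum>
    <Id>1757056</Id>
    <Item Name="AuthorList" Type="List">
      <Item Name="Author" Type="String">Doe J</Item>
      <Item Name="Author" Type="String">Roe R</Item>
    </Item>
    <Item Name="Title" Type="String">An example title</Item>
  </DocSum>
</eSummaryResult>
"""

root = ET.fromstring(SAMPLE_XML)

# ".//" searches all descendants; the predicate keeps only Item
# elements whose Name attribute equals "Author".
authors = [item.text for item in root.findall(".//Item[@Name='Author']")]
print(authors)
```

With the real response, the same findall call could be applied to the root returned by ET.fromstring(id_pubmed), provided the structure matches this assumption.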
Upvotes: 1
Reputation: 293
The .attrib property returns a dictionary of a tag's XML attributes, not the text inside the tag. I think you may want to use either .tag or .text instead. I'm not exactly sure what data you are trying to pull from this tree, but you can also loop over the author value.
Edit:
Well, the eSummaryResult tag seems pointless unless you will have more DocSum tags. But the information you want is in the .text value. Try printing author.tag to check what you are currently iterating over.
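To see what .tag, .attrib, and .text each hold, it can help to print all three for every element in the tree. This is only a sketch: SAMPLE_XML is a made-up fragment with an invented author name, and describe is a hypothetical helper, not part of ElementTree.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment shaped like one DocSum entry; the author name is invented.
SAMPLE_XML = (
    '<DocSum><Id>1757056</Id>'
    '<Item Name="Author" Type="String">Doe J</Item></DocSum>'
)
root = ET.fromstring(SAMPLE_XML)


def describe(element):
    """Return (tag, attributes, text) for the element and all its descendants."""
    return [(node.tag, node.attrib, node.text) for node in element.iter()]


rows = describe(root)
for tag, attrib, text in rows:
    # tag is the element name, attrib is a dict of XML attributes,
    # and text is the string between the opening and closing tags.
    print(tag, attrib, repr(text))
```

Running the same loop over the real root should show which tag names and attributes are actually present, which explains why searching for a tag named 'Author' finds nothing if the authors are stored as Item elements.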
Upvotes: 0