user1140126

Reputation: 2649

Parsing XML data in Python

I am trying to parse an XML string in Python. I am looking for a specific tag, ops:cpc, in the string. How can I get the text it contains? In the example below, the expected result is A61K9/00.

content = '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>\n<ops:world-patent-data xmlns:ops="http://ops.epo.org" xmlns:reg="http://www.epo.org/register" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:cpc="http://www.epo.org/cpcexport" xmlns:cpcdef="http://www.epo.org/cpcdefinition">\n    <ops:meta name="elapsed-time" value="20"/>\n    <ops:classification-scheme>\n        <ops:mappings inputSchema="ECLA" outputSchema="CPC">\n            <ops:mapping additional-only="false">\n                <ops:ecla>A61K9/00</ops:ecla>\n                <ops:cpc xlink:href="classification/cpc/A61K9/00">A61K9/00</ops:cpc>\n            </ops:mapping>\n        </ops:mappings>\n    </ops:classification-scheme>\n</ops:world-patent-data>\n'

from xml.dom import minidom  # import was missing from the snippet

xmldoc = minidom.parseString(content)
itemlist = xmldoc.getElementsByTagName('ops:cpc')
print(len(itemlist))  # number of matching elements

Upvotes: 0

Views: 216

Answers (1)

alko

Reputation: 48287

The text of an element is stored in a child text node, so use the nodeValue property of that child:

>>> itemlist[0].childNodes[0].nodeValue
u'A61K9/00'
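
For completeness, here is a minimal self-contained sketch of the same idea. It assumes a trimmed-down copy of the XML from the question, and `text_of` is a hypothetical helper name (not part of minidom) that joins all direct text-node children instead of relying on `childNodes[0]`, which would fail on an empty element. The namespace-aware lookup at the end is an alternative that matches on the namespace URI rather than on the `ops:` prefix.

```python
from xml.dom import minidom

# Trimmed-down version of the document from the question (assumption:
# only the parts needed for the lookup are kept).
content = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<ops:world-patent-data xmlns:ops="http://ops.epo.org" '
    'xmlns:xlink="http://www.w3.org/1999/xlink">'
    '<ops:mapping additional-only="false">'
    '<ops:ecla>A61K9/00</ops:ecla>'
    '<ops:cpc xlink:href="classification/cpc/A61K9/00">A61K9/00</ops:cpc>'
    '</ops:mapping>'
    '</ops:world-patent-data>'
)


def text_of(element):
    """Join the direct text-node children of a DOM element (hypothetical helper)."""
    return ''.join(
        node.data for node in element.childNodes
        if node.nodeType == node.TEXT_NODE
    )


xmldoc = minidom.parseString(content)

# Lookup by qualified tag name, as in the question.
cpc_values = [text_of(el) for el in xmldoc.getElementsByTagName('ops:cpc')]

# Namespace-aware lookup: independent of the prefix used in the document.
cpc_values_ns = [
    text_of(el)
    for el in xmldoc.getElementsByTagNameNS('http://ops.epo.org', 'cpc')
]

print(cpc_values)
print(cpc_values_ns)
```

Both lists should contain the single value A61K9/00. On Python 2 the strings are shown with a u prefix, as in the output above.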

Upvotes: 2
