user1130601

Reputation: 237

Python - Parse Single Line from XML

Hopefully this is a quick answer for those experienced. I have a single XML file that contains a URL, and I'd like to take that URL and pass it to a downloader script I've written. My only issue is that I can't seem to parse just the URL out of the XML. This is what the file looks like:

<program new-version="1.1.1.1" name="ProgramName">
<download-url value="http://website.com/file.exe"/>
</program>

Thanks in advance!

Upvotes: 1

Views: 7901

Answers (2)

RanRag

Reputation: 49567

 from lxml import etree
 et = etree.parse("your xml file or url")
 # xpath() returns a list of the matching attribute values
 value = et.xpath('//download-url/@value')
 print("".join(value))

output = 'http://website.com/file.exe'

You can also use cssselect:

 import lxml.html
 with open("your xml file") as f:
     doc = lxml.html.fromstring(f.read())
 # CSS selector for the element (requires the cssselect package)
 elements = doc.cssselect('download-url')
 print(elements[0].get('value'))

output = 'http://website.com/file.exe'

Upvotes: 1

phihag

Reputation: 287835

>>> code = '''<program new-version="1.1.1.1" name="ProgramName">
... <download-url value="http://website.com/file.exe"/>
... </program>'''

With lxml:

>>> import lxml.etree
>>> lxml.etree.fromstring(code).xpath('//download-url/@value')[0]
'http://website.com/file.exe'

With the built-in xml.etree.ElementTree:

>>> import xml.etree.ElementTree
>>> doc = xml.etree.ElementTree.fromstring(code)
>>> doc.find('.//download-url').attrib['value']
'http://website.com/file.exe'

With the built-in xml.dom.minidom:

>>> import xml.dom.minidom
>>> doc = xml.dom.minidom.parseString(code)
>>> doc.getElementsByTagName('download-url')[0].getAttribute('value')
u'http://website.com/file.exe'

Which one you pick is entirely up to you. lxml needs to be installed, but is the fastest and most feature-rich library. xml.etree.ElementTree has a funky interface, and its XPath support is limited (how much depends on the version of the Python standard library). xml.dom.minidom does not support XPath and tends to be slower, but implements the cross-platform DOM.
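
If you want to hand the URL to a downloader script, it may help to wrap the lookup in a small function that copes with a missing element. This is only a sketch using the built-in xml.etree.ElementTree; the function name `extract_download_url` is made up for illustration, and it assumes the XML is already in a string:

```python
import xml.etree.ElementTree as ET


def extract_download_url(xml_text):
    """Return the value attribute of the first <download-url>, or None.

    Hypothetical helper, not part of any library.
    """
    root = ET.fromstring(xml_text)
    # './/download-url' searches all descendants of the root element
    element = root.find('.//download-url')
    if element is None:
        return None
    # get() returns None if the attribute itself is missing
    return element.get('value')


sample = '''<program new-version="1.1.1.1" name="ProgramName">
<download-url value="http://website.com/file.exe"/>
</program>'''

print(extract_download_url(sample))
```

Returning None instead of raising lets the calling script decide what to do when the file has no download-url element. To read from a file instead, ET.parse(path).getroot() can replace the ET.fromstring call.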

Upvotes: 9
