Reputation: 93
import xml.dom.minidom
water = """
<channel>
<item>
<title>water</title>
<link>http://www.water.com</link>
</item>
<item>
<title>fire</title>
<link>http://www.fire.com</link>
</item>
</channel>"""
dom = xml.dom.minidom.parseString(water)
linklist = dom.getElementsByTagName('link')
print(len(linklist))
Using minidom, how can I get the text between <link> and </link> as a string?
Upvotes: 2
Views: 1743
Reputation: 7809
If you want to stick with xml.dom.minidom, read .firstChild.nodeValue on each element. You already stored the links in the variable linklist, so to print them, iterate over the list and read .firstChild.nodeValue on each one:
for link in linklist:
    # firstChild is the text node inside <link>...</link>
    print(link.firstChild.nodeValue)
prints...
http://www.water.com
http://www.fire.com
A more detailed answer is here: Get Element value with minidom with Python
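One caveat with .firstChild.nodeValue is that firstChild is None when an element is empty, so the attribute access raises AttributeError. Here is a minimal sketch of a guard, assuming you want an empty string in that case; the helper name link_text and the empty <link></link> item are made up for illustration:

```python
import xml.dom.minidom

doc = """<channel>
<item><title>water</title><link>http://www.water.com</link></item>
<item><title>empty</title><link></link></item>
</channel>"""


def link_text(element):
    """Return the text of the element's first child, or '' if it has none."""
    child = element.firstChild
    # firstChild is None for an empty element such as <link></link>
    return child.nodeValue if child is not None else ""


dom = xml.dom.minidom.parseString(doc)
texts = [link_text(link) for link in dom.getElementsByTagName("link")]
print(texts)
```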
In response to your other question:
To get a specific element, you need to either know its position in the document or search for it.
For example, if you know the link you want is the second link in the XML document, index into the list:
# the variable fire_link is a DOM Element of the second link in the xml file
fire_link = linklist[1]
However, if you want the link but do not know where it is in the document, you have to search for it. Here is an example:
# fire_links is a list of every DOM Element whose text is http://www.fire.com
fire_links = [l for l in linklist if l.firstChild.nodeValue == 'http://www.fire.com']
# take the first element
fire_link = fire_links[0]
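Note that fire_links[0] raises IndexError when nothing matches. As a rough sketch of one way around that, assuming you would rather get None back, you could use next() with a default; find_link is a hypothetical helper name and http://www.earth.com is a made-up URL that is not in the document:

```python
import xml.dom.minidom

doc = """<channel>
<item><title>water</title><link>http://www.water.com</link></item>
<item><title>fire</title><link>http://www.fire.com</link></item>
</channel>"""


def find_link(links, url):
    """Return the first <link> element whose text equals url, or None."""
    return next(
        (link for link in links
         # skip empty elements, whose firstChild is None
         if link.firstChild is not None and link.firstChild.nodeValue == url),
        None,
    )


linklist = xml.dom.minidom.parseString(doc).getElementsByTagName("link")
fire_link = find_link(linklist, "http://www.fire.com")
missing = find_link(linklist, "http://www.earth.com")
print(fire_link.firstChild.nodeValue)
print(missing)
```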
Upvotes: 2
Reputation: 328594
This is more complicated than it looks, because an element's text can be split across several child nodes. Following the examples in the minidom documentation, append this to the code in your question:
def getText(nodelist):
    # collect the data of every text node among the children
    rc = []
    for node in nodelist:
        if node.nodeType == node.TEXT_NODE:
            rc.append(node.data)
    return ''.join(rc)

text = getText(linklist[0].childNodes)
print(text)
I suggest trying the ElementTree module (xml.etree.ElementTree) instead. If linklist holds ElementTree elements rather than minidom nodes, the code would be:
print(linklist[0].text)
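For completeness, here is a small self-contained sketch of the ElementTree route using the XML from the question. It assumes the standard-library xml.etree.ElementTree module; the path "item/link" is just one way to select the elements:

```python
import xml.etree.ElementTree as ET

doc = """<channel>
<item><title>water</title><link>http://www.water.com</link></item>
<item><title>fire</title><link>http://www.fire.com</link></item>
</channel>"""

root = ET.fromstring(doc)  # root is the <channel> element

# every <link> that is a direct child of an <item> under the root
linklist = root.findall("item/link")
print(linklist[0].text)

# or walk the whole tree and collect the text of every <link>
links = [link.text for link in root.iter("link")]
print(links)
```

Keep in mind that .text is None for an empty element, so check for that if your feed can contain empty <link> tags.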
Upvotes: 1