Reputation: 140
<tree>
<item>
<element1>somedata</element1>
<element2>moredata</element2>
<element3>data?</element3>
<optional_element>data!</optional_element>
</item>
<item>
<element1>somedata</element1>
<element2>moredata</element2>
<element3>data?</element3>
</item>
<item>
<element1>somedata</element1>
<element2>moredata</element2>
<element3>data?</element3>
<optional_element>data!</optional_element>
</item>
<item>
<element1>somedata</element1>
<element2>moredata</element2>
<element3>data?</element3>
</item>
</tree>
I have an XML document like this one. What I am trying to accomplish is to get this kind of output:
["data!", "", "data!", ""]
instead of just ["data!", "data!"]
So far I have tried this approach, but I have not been able to make it work (the list still only includes the elements that are present).
Upvotes: 1
Views: 146
Reputation: 473863
I would use findtext() and specify the default:
[item.findtext("optional_element", default="") for item in tree.findall("item")]
Demo (using lxml):
>>> from lxml import etree
>>>
>>> data = """<?xml version="1.0" encoding="utf-8"?>
... <tree>
... <item>
... <element1>somedata</element1>
... <element2>moredata</element2>
... <element3>data?</element3>
... <optional_element>data!</optional_element>
... </item>
... <item>
... <element1>somedata</element1>
... <element2>moredata</element2>
... <element3>data?</element3>
... </item>
... <item>
... <element1>somedata</element1>
... <element2>moredata</element2>
... <element3>data?</element3>
... <optional_element>data!</optional_element>
... </item>
... <item>
... <element1>somedata</element1>
... <element2>moredata</element2>
... <element3>data?</element3>
... </item>
... </tree>
... """
>>>
>>> tree = etree.fromstring(data)
>>> print([item.findtext("optional_element", default="") for item in tree.findall("item")])
['data!', '', 'data!', '']
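If lxml is not available, the same idea should also work with the standard library's xml.etree.ElementTree, which provides findtext() with a default argument too. The following is a minimal sketch under that assumption; the helper name optional_texts and the shortened sample document (including the empty <optional_element/> case) are made up for illustration and are not part of the original question.

```python
import xml.etree.ElementTree as ET

# Shortened sample document (hypothetical): one item with the optional
# element, one without it, and one where it is present but empty.
data = """<tree>
<item><element1>somedata</element1><optional_element>data!</optional_element></item>
<item><element1>somedata</element1></item>
<item><element1>somedata</element1><optional_element/></item>
</tree>"""

root = ET.fromstring(data)


def optional_texts(root, tag="optional_element"):
    """Return one string per <item>, using "" when the tag is missing."""
    # findtext() looks up the first matching child of each item and
    # returns `default` when there is no such child.
    return [item.findtext(tag, default="") for item in root.findall("item")]


values = optional_texts(root)
print(values)
```

Because the comprehension iterates over the items rather than over the optional elements themselves, the result always has one entry per item, so positions stay aligned with the rest of the data.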
Upvotes: 3