Reputation: 1
I have an XML file, and I need to extract all the text inside the feature tag in Python:
<person>
<text id="1">
<title>
student
</title>
<feature>
xxxx
<name>yyyy</name>
zzzz
<country>dddd</country>
ffff
</feature>
My code is this:
for person in tree.iter():
    for text in person:
        for feature in text:
            if feature.tag == "feature":
                print(feature.text)
It only prints "xxxx", but the output I want is: xxxx yyyy zzzz dddd ffff
Upvotes: 0
Views: 3647
Reputation: 5440
Of course, there's a line missing at the end (</person>), and you should mention which library you are using, if any.
If you use a library to parse the XML into a tree structure, say xml.etree.ElementTree, you can fairly easily extract tags, attributes, and even text with the library's query functions. You can do so in the order you want and build a result in your desired format.
xml.etree.ElementTree is part of the Python standard library. Have a look at the Python ElementTree documentation. There are plenty of examples.
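As a rough sketch of what that could look like for the XML in the question (assuming the missing closing tags are added so the document parses): in ElementTree, feature.text only holds the text before the first child element, and the text after each child is stored in that child's tail, which is why only "xxxx" was printed. Element.itertext() walks the element's own text plus every descendant's text and tail in document order. The XML string and the feature_text helper below are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Sample from the question, with </text> and </person> added (an assumption)
# so that the document is well-formed.
XML = """<person>
<text id="1">
<title>
student
</title>
<feature>
xxxx
<name>yyyy</name>
zzzz
<country>dddd</country>
ffff
</feature>
</text>
</person>"""


def feature_text(xml_string):
    """Return the whitespace-normalised text of every <feature> element."""
    root = ET.fromstring(xml_string)
    results = []
    # iter("feature") finds <feature> elements at any depth.
    for feature in root.iter("feature"):
        # itertext() yields the element's text and all nested text and tails.
        words = " ".join(feature.itertext()).split()
        results.append(" ".join(words))
    return results


print(feature_text(XML))
```

If the XML lives in a file, ET.parse(path).getroot() can take the place of ET.fromstring(...).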
Upvotes: 1