Tony Tannous

Reputation: 14876

Extract data based on tag attribute

Given the following input:

<tag1>
    <tag2 id="value">
        <tag3>
            text
        </tag3>
        <tag4>
            text
        </tag4>
    </tag2>
</tag1>

I would like to extract the text inside tag3 only if the id attribute of the enclosing tag2 equals a given value.

So far I am able to extract the text regardless of the value:

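import xml.etree.ElementTree as ET

tree = ET.parse(inFile)
root = tree.getroot()
# Writes the text of every tag3, without checking the id of its parent tag2
with open('output.txt', 'w') as text_file:
    for p in root.iter('tag3'):
        text_file.write(p.text + "\n")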

But I can't work out how to go up from tag3 and check the value of the attribute in tag2.

Upvotes: 0

Views: 50

Answers (1)

Arun

Reputation: 1179

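You can do it with BeautifulSoup:

from bs4 import BeautifulSoup

# Read the file and parse it with the parser bundled with Python
with open('data.xml') as f:
    d = BeautifulSoup(f.read(), 'html.parser')
# Select tag2 by its id attribute, then print the text of the tag3 inside it
print(d.find('tag2', id='value').find('tag3').get_text())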
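
If you would rather stay with ElementTree, as in the question, one possible approach is to avoid "going up" at all: select tag2 by its attribute first and then step down to tag3, using the limited XPath support in findall. The sketch below is only an illustration. The helper name tag3_texts, the inline SAMPLE string and the second tag2 with id="other" are assumptions added for the demo, and it assumes the id value contains no single quote.

```python
import xml.etree.ElementTree as ET

# Sample input from the question, plus a second tag2 (an assumption added
# here) to show that non-matching ids are skipped.
SAMPLE = """<tag1>
    <tag2 id="value">
        <tag3>
            text
        </tag3>
        <tag4>
            text
        </tag4>
    </tag2>
    <tag2 id="other">
        <tag3>skipped</tag3>
    </tag2>
</tag1>"""


def tag3_texts(root, wanted_id):
    """Return the stripped text of each tag3 whose parent tag2 has id == wanted_id."""
    # The [@id='...'] predicate filters tag2 elements by attribute value.
    path = ".//tag2[@id='{}']/tag3".format(wanted_id)
    return [(el.text or "").strip() for el in root.findall(path)]


root = ET.fromstring(SAMPLE)
print(tag3_texts(root, "value"))
```

With a file on disk you would build root from ET.parse(inFile).getroot() instead of ET.fromstring, and write the returned strings to output.txt.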

Upvotes: 2
