PepeHands

Reputation: 1406

How to use lxml in Python 3?

In my project I need to parse an XML document using lxml.etree. I'm new to Python, and I can't figure out how to find all elements with a given tag. Let me describe the problem more precisely.

I have an XML document like this:

<cinema>
  <name>BestCinema</name>
  <films>
    <categories>
      <category>Action</category>
      <category>Thriller</category>
      <category>Soap opera</category>
    </categories>
  </films>
</cinema>

Now I need to get the list of all categories. In this case it will be:

      <category>Action</category>
      <category>Thriller</category>
      <category>Soap opera</category>

I need to use:

tree = etree.parse(file)

Thank you, any help is welcome.

Upvotes: 3

Views: 12056

Answers (1)

mata

Reputation: 69012

It should be as simple as:

from lxml import etree

tree = etree.parse('input.xml')
# xpath() returns a list of Element objects, one per <category> in the document
categories = tree.xpath("//category")
print(categories)
...

Everything else you should be able to find in the lxml tutorial.

Upvotes: 3
