pjain

Reputation: 683

Find an XML element with specific text using XPath or find in Python with lxml

I am trying to find all book elements whose name child has the value abc. I used this XPath:

val= xml1.xpath('//bookstore/book/name[text()="abc"]')

But it is returning None.

<bookstore>
 <book>
   <name>abc</name>
   <price>30</price>
 </book>
 <book>
   <name>Learning XML</name>
   <price>56</price>
 </book>
</bookstore> 

Upvotes: 4

Views: 7290

Answers (2)

mzjn

Reputation: 50937

Here is one way to do it:

from lxml import etree

# Create an ElementTree instance 
tree = etree.parse("bookstore.xml")  

# Get all 'book' elements that have a 'name' child with a string value of 'abc'
books = tree.xpath('book[name="abc"]')

# Print name and price of those books
for book in books:
    print(book.find("name").text, book.find("price").text)

Output when using the XML in the question:

abc 30
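For anyone who wants to try this without a bookstore.xml file, here is a minimal Python 3 sketch that parses the XML from the question as an inline string. The helper name books_named and the XPath variable $wanted are illustrative choices, not part of the original answer. Passing the search text as an XPath variable avoids having to escape quotes inside the expression.

```python
from lxml import etree

# Same XML as in the question, inlined so no file is needed.
XML = b"""<bookstore>
 <book>
   <name>abc</name>
   <price>30</price>
 </book>
 <book>
   <name>Learning XML</name>
   <price>56</price>
 </book>
</bookstore>"""

root = etree.fromstring(XML)


def books_named(root, wanted):
    """Return the 'book' elements whose 'name' child has the string value `wanted`.

    Hypothetical helper for illustration only.
    """
    # $wanted is an XPath variable; lxml fills it in from the keyword argument.
    return root.xpath("//book[name=$wanted]", wanted=wanted)


for book in books_named(root, "abc"):
    print(book.findtext("name"), book.findtext("price"))
```

Note that xpath() returns a list here, so a search with no matches gives an empty list rather than None.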

Upvotes: 3

Vivek Sable

Reputation: 10213

I added an id attribute to each book tag so the results are easier to tell apart.

root.xpath("//bookstore/book/name[text()='abc']") returns a list of all name elements whose text is abc, not their parent book elements.

Check the following:

>>> data = """<bookstore>
...  <book id="1">
...    <name>abc</name>
...    <price>30</price>
...  </book>
...  <book id="2">
...    <name>Learning XML</name>
...    <price>56</price>
...  </book>
... </bookstore> """
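>>> from lxml import etree as PARSER  # assumed import; the answer never shows how PARSER is defined
>>> root = PARSER.fromstring(data)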
>>> root.xpath("//bookstore/book")
[<Element book at 0xb726d144>, <Element book at 0xb726d2d4>]
>>> root.xpath("//bookstore/book/name[text()='abc']")
[<Element name at 0xb726d9b4>]
>>> root.xpath("//bookstore/book/name[text()='abc']/parent::*")
[<Element book at 0xb726d7d4>]
>>> root.xpath("//bookstore/book/name[text()='abc']/parent::*")[0].attrib
{'id': '1'}
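As a rough comparison, the parent step can also be written with the abbreviated .. syntax, or avoided entirely by putting the condition in a predicate on book. The sketch below assumes Python 3 and rebuilds the same document inline; the variable names are only for illustration.

```python
from lxml import etree

root = etree.fromstring(
    '<bookstore>'
    '<book id="1"><name>abc</name><price>30</price></book>'
    '<book id="2"><name>Learning XML</name><price>56</price></book>'
    '</bookstore>'
)

# Select the name element first, then step up to its parent.
via_parent_axis = root.xpath("//bookstore/book/name[text()='abc']/parent::*")
# ".." is the abbreviated form of parent::node().
via_dot_dot = root.xpath("//bookstore/book/name[text()='abc']/..")
# Select the book directly, filtering on the string value of its name child.
via_predicate = root.xpath("//bookstore/book[name='abc']")

# Collect the id attribute of every match for each variant.
ids = [
    [book.get("id") for book in result]
    for result in (via_parent_axis, via_dot_dot, via_predicate)
]
print(ids)
```

All three expressions select the same book element for this document, so which one to use is mostly a matter of readability.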

For Python beginners, the same thing without the parent axis:

  1. Create a parsed object from the data.
  2. Define a list variable for the parents.
  3. Iterate over the name tags.
  4. Check whether the text of the name tag equals abc.
  5. If it does, get the parent of the name tag and append it to the list.
  6. Display the result.

Code:

>>> root = PARSER.fromstring(data)
>>> abc_parent = []
>>> for i in root.iter("name"):  # iter() replaces the deprecated getiterator()
...    if i.text=="abc":
...        abc_parent.append(i.getparent())
... 
>>> print(abc_parent)
[<Element book at 0xb726d2d4>]
>>> abc_parent[0].attrib
{'id': '1'}

Upvotes: 3
