felix89

Reputation: 1

Pentaho - XPath Select all children with specific parent

I'm using Pentaho 'Get data from XML'.

I want to select all <price> children whose parent <book> has the name attribute '1.1'.

<bookstore>
  <book name = '1.1'>
    <title lang="en">Learning XML</title>
    <price>29.99</price>
    <price>39.99</price>
    <price>59.99</price>
  </book>

  <book name = '1.2'>
    <title lang="en">Harry Potter</title>
    <price>39.95</price>
  </book>
</bookstore>

This is how I configured the step:

Configuration Step Content

Configuration Step Fields

And this is the result I got:

Result

If I change the 'Loop XPath' in Content to /bookstore/book/price, I get 4 rows, all with the same first price (29.99).

Upvotes: 0

Views: 2008

Answers (3)

felix89

Reputation: 1

Thank you Tomalak! It works. Wasmachien, I tried /bookstore/book[@name='1.1']/* but it doesn't work in Pentaho. Thanks for the answer anyway!

The key was: "In "Fields" you normally set up the data fields to be extracted from each of those items. Therefore the XPath should be relative here."

Now I'm trying to select all child nodes of a specific book with //book[@name = '1.1'], for the case where the parent book also has other children, e.g. 'price' and 'discount'. Using . or price as the relative path doesn't work; it returns just the first child again. Is it possible to use //book[@name = '1.1'] and get all the children with their respective nodes?

Thanks

Upvotes: 0

Tomalak

Reputation: 338326

In "Content" you set the "loop XPath" to /bookstore/book, so you will end up with a loop over (in this example) two items - the one you want, and the other one.

In "Fields" you normally set up the data fields to be extracted from each of those items. Therefore the XPath should be relative here.

But you used //book[@name = '1.1']/price, which is an absolute path. It ignores the current loop item and selects the same three <price> elements on every iteration, of which Pentaho can only take the first one to populate a field. That's why you get 29.99 two times.
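
To make the effect concrete, here is a rough emulation in Python using the standard library's xml.etree.ElementTree. This is not Pentaho itself, only a sketch of the loop-plus-field behavior described above; the function name rows_with_absolute_field is made up for illustration, and .//book stands in for //book because ElementTree only accepts relative paths on an element.

```python
import xml.etree.ElementTree as ET

XML = """<bookstore>
  <book name='1.1'>
    <title lang="en">Learning XML</title>
    <price>29.99</price>
    <price>39.99</price>
    <price>59.99</price>
  </book>
  <book name='1.2'>
    <title lang="en">Harry Potter</title>
    <price>39.95</price>
  </book>
</bookstore>"""

root = ET.fromstring(XML)  # root is the <bookstore> element


def rows_with_absolute_field(root):
    """Emulate: loop over /bookstore/book, field = //book[@name='1.1']/price."""
    rows = []
    for _book in root.findall("book"):  # one iteration per <book>
        # The field path is evaluated from the document root, so the
        # current loop item (_book) has no influence on the result.
        matches = root.findall(".//book[@name='1.1']/price")
        # Assumption (from the answer): only the first match fills the field.
        rows.append(matches[0].text)
    return rows


print(rows_with_absolute_field(root))  # ['29.99', '29.99']
```

Two loop items, each filled from the same first match, gives the duplicated 29.99.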

What to do? It's always the same approach.

To get general information on all books:

  • Select the right items in the "loop" part: //book
  • Select the field values using relative paths: ./price[1] and probably ./title

To get general information on one specific book:

  • Select the right item in the "loop" part: //book[@name = '1.1']
  • Select the field values using relative paths: ./price[1] and probably ./title

To get prices of one specific book:

  • Select the right items in the "loop" part: //book[@name = '1.1']/price
  • Select the field values using relative paths: ./text() (or simply .)
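
The three recipes above can be sketched in Python with xml.etree.ElementTree. Again this only emulates the "loop XPath, then relative field paths" idea and is not Pentaho's implementation; the helper extract is hypothetical. ElementTree supports a subset of XPath, so .//book replaces //book and . is used instead of ./text().

```python
import xml.etree.ElementTree as ET

XML = """<bookstore>
  <book name='1.1'>
    <title lang="en">Learning XML</title>
    <price>29.99</price>
    <price>39.99</price>
    <price>59.99</price>
  </book>
  <book name='1.2'>
    <title lang="en">Harry Potter</title>
    <price>39.95</price>
  </book>
</bookstore>"""

root = ET.fromstring(XML)


def extract(root, loop_path, field_paths):
    """Return one row per node matched by loop_path.

    Each field path is evaluated relative to the current loop item,
    and only its first match is used.
    """
    rows = []
    for item in root.findall(loop_path):
        row = []
        for path in field_paths:
            node = item.find(path)  # first match relative to the loop item
            row.append(node.text if node is not None else None)
        rows.append(row)
    return rows


# General information on all books
all_books = extract(root, ".//book", ["./title", "./price[1]"])
# General information on one specific book
one_book = extract(root, ".//book[@name='1.1']", ["./title", "./price[1]"])
# Prices of one specific book: the loop selects the <price> nodes themselves
prices = extract(root, ".//book[@name='1.1']/price", ["."])

print(all_books)  # [['Learning XML', '29.99'], ['Harry Potter', '39.95']]
print(one_book)   # [['Learning XML', '29.99']]
print(prices)     # [['29.99'], ['39.99'], ['59.99']]
```

The point of the design is that the loop path decides how many rows you get, while the field paths only pick values out of each row's node.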

Upvotes: 1

wasmachien

Reputation: 1009

Try /bookstore/book[@name='1.1']/*
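
For reference, a small Python sketch (standard library xml.etree.ElementTree, not Pentaho) of what this expression selects on a trimmed copy of the sample document. The /bookstore/ prefix is dropped here only because ElementTree evaluates paths relative to the root element.

```python
import xml.etree.ElementTree as ET

XML = """<bookstore>
  <book name='1.1'>
    <title lang="en">Learning XML</title>
    <price>29.99</price>
    <price>39.99</price>
    <price>59.99</price>
  </book>
  <book name='1.2'>
    <title lang="en">Harry Potter</title>
    <price>39.95</price>
  </book>
</bookstore>"""

root = ET.fromstring(XML)  # root is <bookstore>

# Equivalent of /bookstore/book[@name='1.1']/* evaluated from the root element
children = root.findall("book[@name='1.1']/*")

# The wildcard matches every child element, so <title> is included too
selected = [(child.tag, child.text) for child in children]
print(selected)
```

Note that the wildcard returns the title as well as the three prices, so a further filter is needed if only prices are wanted.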

Upvotes: 0
