Rober

Reputation: 6106

How to get a specific XML node with Nokogiri and XPath

I have this structure in XML:

<resource id="2023984310000103605" name="Rebelezza">
      <prices>
         <price datefrom="2019-10-31" dateto="2019-12-31" price="2690.0" currency="EUR" />
         <price datefrom="2020-01-01" dateto="2020-03-31" price="2690.0" currency="EUR" />
         <price datefrom="2020-03-31" dateto="2020-04-30" price="3200.0" currency="EUR" />
      </prices>                   
      <products>
         <product name="specific-product1">
            <prices>
               <price datefrom="2019-10-31" dateto="2019-12-31" price="2690.0" currency="EUR" />
               <price datefrom="2020-01-01" dateto="2020-03-31" price="2690.0" currency="EUR" />
               <price datefrom="2020-03-31" dateto="2020-04-30" price="3200.0" currency="EUR" />              
            </prices>
         </product>
      </products>
</resource>

How can I get only the prices directly under resource, without the prices inside products, using an XPath selector?

At the moment, I have something like:

resources = resourcesParsed.xpath("//resource")
for resource in resources do
  prices = resource.xpath(".//prices/price[number(translate(@dateto, '-', '')) >= 20190101]")
end

However, I am getting both the prices directly under the resource element and the ones under products. I'm not interested in the prices under products.

Upvotes: 0

Views: 263

Answers (2)

the Tin Man

Reputation: 160631

I'd do it this way:

require 'nokogiri'
doc = Nokogiri::XML(<<EOT)
<resource>
      <prices>
         <price price="1"/>
      </prices>                   
      <products>
         <product>
            <prices>
               <price price="-1"/>
            </prices>
         </product>
      </products>
</resource>
EOT

doc.search('resource > prices > price').map { |p| p['price'] }
# => ["1"]

This won't match the price nodes under products or product. The > in the selector is the CSS child combinator, so in CSS-ese it means "find the resource node, then its prices child, then that node's price children". Anything not on that path is ignored.

Most of the time I find CSS selectors easier to write and understand, and less visually noisy. Even the Nokogiri docs recommend using CSS for those reasons.

Upvotes: 1

E.Wiest

Reputation: 5915

Two options with XPath:

.//price[parent::prices[parent::resource]]
.//price[ancestor::*[2][name()="resource"]]

Output: 3 nodes for either expression.

To add a date condition, you can reuse what you already have:

.//price[parent::prices[parent::resource]][translate(@dateto, '-', '') >= 20200101]

Upvotes: 1
