Rober

Reputation: 6106

Parsing sub nodes with Nokogiri

I'm trying to parse this XML structure with Nokogiri's XPath.

<root>
  <resource id='1' name='name1'>
     <prices>
         <price datefrom='2015-01-01' dateto='2015-05-31' price='3000' currency='EUR'></price>
         <price datefrom='2015-06-01' dateto='2015-12-31' price='4000' currency='EUR'></price>
     </prices>
  </resource>
  <!-- many more resource nodes -->
</root>

I'm iterating over the resources, and for each resource I need to get its <price> elements:

resourcesParsed = Nokogiri::XML(resourcesXML)
resources = resourcesParsed.xpath("//resource")
for resource in resources do
  id = resource["id"]
  # insert in resources tables
  # parsing resource prices
  getPrices(resource)
end
...

def getPrices(resource)
  prices = resource.xpath("//price") 
  @logger.debug "prices=" + prices.to_s
  # do whatever
end 

For some reason, when I query //price it returns not only the <price> nodes inside the current resource, but all the <price> nodes in the whole XML document.

How can I parse only the <price> nodes of a resource?

Upvotes: 0

Views: 142

Answers (2)

the Tin Man

Reputation: 160631

I'd write the code like this, where doc is the parsed document (the result of Nokogiri::XML):

resources = doc.search('resource').map{ |resource|
  [
    resource['id'],
    resource.search('price').map{ |price|
      {
        price:    price['price'],
        datefrom: price['datefrom'],
        dateto:   price['dateto'],
        currency: price['currency']
      }
    }
  ]
}

At this point resources is an array of [id, prices] pairs. Each pair holds a resource's id and an array of hashes, one hash per embedded price:

# => [["1",
#      [{:price=>"3000",
#        :datefrom=>"2015-01-01",
#        :dateto=>"2015-05-31",
#        :currency=>"EUR"},
#       {:price=>"4000",
#        :datefrom=>"2015-06-01",
#        :dateto=>"2015-12-31",
#        :currency=>"EUR"}]]]

It'd be a little easier to reuse that for lookups or further processing if it's a hash keyed by resource id, where each value is that resource's array of price hashes:

resources.to_h
# => {"1"=>
#      [{:price=>"3000",
#        :datefrom=>"2015-01-01",
#        :dateto=>"2015-05-31",
#        :currency=>"EUR"},
#       {:price=>"4000",
#        :datefrom=>"2015-06-01",
#        :dateto=>"2015-12-31",
#        :currency=>"EUR"}]}

Upvotes: 1

Rober

Reputation: 6106

I got it.

Instead of:

prices = resource.xpath("//price") 

I should search:

prices = resource.xpath(".//price") 

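The leading dot anchors the search at the current node, so only that resource's <price> descendants are returned. Without it, //price starts from the document root, no matter which node xpath is called on.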

Upvotes: 2
