Ger Cas

Reputation: 2298

Get values from nested xml file with Ruby/Nokogiri

I'm trying to get values from the following XML file, but I'm stuck because I'm not getting the output I'd like. Maybe somebody could help me with this.

My current code is:

require 'nokogiri'

doc = Nokogiri.XML(xml)

d=doc.xpath("//NtrkData/Rutins//GT_Nmbbrs/RngeDat")
puts d.xpath("//EE").text + "-" + d.xpath("//PR").text + "-" + d.xpath("//Brng").text + "-" + d.xpath("//Erng").text

I'm getting this output:

3Z94PL-45156-73359-86353

but what I'd like to get is the values of the elements EE, PR, Brng (if it exists) and Erng (if it exists), with all four values for each Nmbbr on the same line.

So for the following XML, the output I'm looking for would be:

3Z9 45 
4PL 156 73359 86353

The XML is:

xml =<<_
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<main>
<Oganin>
    <Oganna>EJ-MKKL</Oganna>
    <CutryI>YUFG</CutryI>
    <Ntwl>
    <Ntrk>
        <TGCo>KOLPWE</TGCo>
        <NtrkType>Uymmls</NtrkType>
        <NtrkData>
        <Rutins>
            <Rutinf>
            <CTT>
                <GT_Nmbbrs>
                <RngeDat>
                    <Nmbbr>
                    <EE>3Z9</EE>
                    <PR>45</PR>
                    </Nmbbr>
                </RngeDat>
                <RngeDat>
                    <Nmbbr>
                    <EE>4PL</EE>
                    <PR>156</PR>
                    <Srng>
                        <Brng>73359</Brng>
                        <Erng>86353</Erng>
                    </Srng>
                    </Nmbbr>
                </RngeDat>
                </GT_Nmbbrs>
            </CTT>              
            </Rutinf>
        </Rutins>
        </NtrkData>
    </Ntrk>
    </Ntwl>
</Oganin>
</main>
_

Upvotes: 0

Views: 565

Answers (1)

Aleksei Matiushkin

Reputation: 121000

Nokogiri has excellent documentation, and it clearly states that Nokogiri::XML::NodeSet#inner_text (which #text is an alias for) does not do what you expect: it joins the text of all the nodes in the set into a single string.

Also, you can't just use map(&:text) on the bulk query results, as the documentation suggests, because you want to keep the <Srng> children associated with their parent <Nmbbr>, which is impossible with bulk queries.

That said, you need to query the respective parents and iterate over their children:

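# Query each <Nmbbr> parent, then read its children relative to that node.
# <Srng> is optional: when it is absent, the inner map yields an empty array.
# Finally, flatten the nested node sets and take the text of each node.
d.xpath('//Nmbbr').
  map do |node|
    [
      node.xpath("./EE"),
      node.xpath("./PR"),
      node.xpath("./Srng").map do |srng|
         %w[Brng Erng].map { |path| srng.xpath("./#{path}") }
      end
    ]
  end.
  map { |nodes| nodes.flatten.map(&:text) }
  #⇒ [["3Z9", "45"], ["4PL", "156", "73359", "86353"]]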

Now iterate over the result and print it however you want.
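For the output format asked for in the question, one possible way to print the result is sketched below. It assumes the nested array from the snippet above has been assigned to a variable; the names `rows` and `lines` are made up for this illustration, and the array is written out literally here so the sketch runs on its own.

```ruby
# `rows` stands in for the nested array produced by the query above
# (hypothetical name; the values are copied from the result shown there).
rows = [["3Z9", "45"], ["4PL", "156", "73359", "86353"]]

# Join the values of each <Nmbbr> with a single space, one line per element.
lines = rows.map { |values| values.join(" ") }

# puts prints each array element on its own line.
puts lines
```

Rows without a <Srng> simply contain two values, so no special handling is needed for the missing Brng and Erng.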

Upvotes: 2
