azaytc

Reputation: 63

Parse XML to Ruby objects and create attribute methods dynamically?

I need to parse an XML file to Ruby objects.

Is there a tool that exposes the XML attributes as methods, so that report.system_slots.items returns an array of item properties and report.system_slots.current_usage returns 'Available'?

Is it possible to do this with Nokogiri?

<page Title="System Slots" H1="Property" H2="Value" __type__="2">
  <item Property="System Slot 1">
  <item Property="Name" Value="PCI1"/>
  <item Property="Type" Value="PCI"/>
  <item Property="Data Bus Width" Value="32 bits"/>
  <item Property="Current Usage" Value="Available"/>
  <item Property="Characteristics">
    <item Property="Vcc voltage supported" Value="3.3 V, 5.0 V"/>
    <item Property="Shared" Value="No"/>
    <item Property="PME Signal" Value="Yes"/>
    <item Property="Support Hot Plug" Value="No"/>
    <item Property="PCI slot supports SMBus signal" Value="Yes"/>
  </item>
</item>

Upvotes: 1

Views: 5804

Answers (1)

the Tin Man

Reputation: 160551

Look at Ox. It reads XML and returns a reasonable Ruby object facsimile of the XML.

require 'ox'

hash = {'foo' => { 'bar' => 'hello world'}}

puts Ox.dump(hash)

pp Ox.parse_obj(Ox.dump(hash))

Dumping that into IRB gives me:

require 'ox'

 >   hash = {'foo' => { 'bar' => 'hello world'}}
{
    "foo" => {
        "bar" => "hello world"
    }
}

 >   puts Ox.dump(hash)
<h>
  <s>foo</s>
  <h>
    <s>bar</s>
    <s>hello world</s>
  </h>
</h>
nil

 >   pp Ox.parse_obj(Ox.dump(hash))
{"foo"=>{"bar"=>"hello world"}}
{
    "foo" => {
        "bar" => "hello world"
    }
}

That said, your XML sample is malformed (the <page> element is never closed) and won't work with Ox. Nokogiri WILL parse it, but it reports errors while doing so, which hints that the resulting DOM may not match what you intended.

My question is, why do you want to convert the XML to an object? It is SO much easier to handle XML using a parser like Nokogiri. Using a fixed version of your XML:

require 'nokogiri'

xml = '
<xml>
<page Title="System Slots" H1="Property" H2="Value" __type__="2">
  <item Property="System Slot 1"/>
  <item Property="Name" Value="PCI1"/>
  <item Property="Type" Value="PCI"/>
  <item Property="Data Bus Width" Value="32 bits"/>
  <item Property="Current Usage" Value="Available"/>
  <item Property="Characteristics">
    <item Property="Vcc voltage supported" Value="3.3 V, 5.0 V"/>
    <item Property="Shared" Value="No"/>
    <item Property="PME Signal" Value="Yes"/>
    <item Property="Support Hot Plug" Value="No"/>
    <item Property="PCI slot supports SMBus signal" Value="Yes"/>
  </item>
</page>
</xml>'

doc = Nokogiri::XML(xml)

page = doc.at('page')
page['Title'] # => "System Slots"
page.at('item[@Property="Current Usage"]')['Value'] # => "Available"

item_properties = page.at('item[@Property="Characteristics"]')
item_properties.at('item[@Property="PCI slot supports SMBus signal"]')['Value'] # => "Yes"

Converting a big XML document into plain Ruby objects can leave you with a labyrinth of arrays and hashes that still have to be peeled apart to reach the values you want. With Nokogiri you have CSS and XPath accessors, which are easy to learn and read; I used CSS above but could easily have used XPath to accomplish the same things.

Upvotes: 6
