iamjonesy

Reputation: 25122

Reading an XML file with Nokogiri in Rails 3

I have a problem looping through part of an XML file. I'm using Nokogiri with Rails 3.

I'm reading this XML feed - http://www.ecb.europa.eu/stats/eurofxref/eurofxref-hist-90d.xml

Here is my code:

def save_rates

    # get the XML data from the ECB URL
    file_handle = open('http://www.ecb.europa.eu/stats/eurofxref/eurofxref-hist-90d.xml')

    # get document xml string and create Nokogiri object
    doc = Nokogiri::XML(file_handle)

    # foreach date...
    doc.xpath("//Cube/Cube").each do |cube|

        raise cube.inspect # isn't being executed

        # foreach currency...
        cube.xpath("./Cube").each do |curr|
            # create DB entry
            Exchange.create(:currency=>curr.currency, :rate=>curr.rate, :record_date => cube.time)

        end
    end

end

When I inspect doc I can see the Nokogiri object. However, when I try to raise cube.inspect inside the first .each loop, it never fires. That has led me to believe my path is wrong: //Cube/Cube.

The paths in other examples I have seen in Nokogiri tutorials look similar to that. Is my path wrong, or is there something else I've done wrong here?

I'm ruby n00b so please go easy!

UPDATE

Here is the format of the XML

<gesmes:Envelope xmlns:gesmes="http://www.gesmes.org/xml/2002-08-01" xmlns="http://www.ecb.int/vocabulary/2002-08-01/eurofxref">
    <gesmes:subject>Reference rates</gesmes:subject>
    <gesmes:Sender>
    <gesmes:name>European Central Bank</gesmes:name>
    </gesmes:Sender>
    <Cube>
        <Cube time="2013-02-25">
            <Cube currency="USD" rate="1.3304"/>
            <Cube currency="JPY" rate="125"/>
            <Cube currency="BGN" rate="1.9558"/>
            <Cube currency="CZK" rate="25.52"/>
            <Cube currency="DKK" rate="7.4614"/>
            <Cube currency="GBP" rate="0.8789"/>
            ...
        </Cube>
        <Cube time="2013-02-24">
            <Cube currency="USD" rate="1.3304"/>
            <Cube currency="JPY" rate="125"/>
            <Cube currency="BGN" rate="1.9558"/>
            <Cube currency="CZK" rate="25.52"/>
            <Cube currency="DKK" rate="7.4614"/>
            <Cube currency="GBP" rate="0.8789"/>
            ...
        </Cube>
    </Cube>
</gesmes:Envelope>

Upvotes: 2

Views: 1758

Answers (1)

matt

Reputation: 79733

The problem here is due to XML namespaces.

The root element of the XML carries the attribute xmlns="http://www.ecb.int/vocabulary/2002-08-01/eurofxref", which sets the default namespace. The Cube elements are in this namespace, so an XPath expression that names Cube without a namespace will not match them.

To specify the namespace in Nokogiri you can do something like this:

doc.xpath("//ecb:Cube/ecb:Cube", 'ecb' => "http://www.ecb.int/vocabulary/2002-08-01/eurofxref")

Here we’ve given the namespace the prefix ecb, and use the prefix in the XPath expression.

In this case, where the namespace is the default namespace declared on the root node, Nokogiri registers it under the xmlns prefix for us, so we can use the simpler:

doc.xpath("//xmlns:Cube/xmlns:Cube")

This gives the same result as the first example.

An even simpler possibility, if you’re not interested in the namespaces, is to use the remove_namespaces! method:

doc.remove_namespaces!
doc.xpath("//Cube/Cube")

The result of this isn’t quite the same as the first two examples since the namespace information has been removed, but it will give you the nodes you are expecting.

Upvotes: 4
