ezuk
ezuk

Reputation: 3196

Nokogiri XML to node

I'm reading a local HTML document with Nokogiri like so:

f = File.open(local_xml)
@doc = Nokogiri::XML(f)
f.close

@doc contains a Nokogiri XML object that I can parse using at_css.

I want to modify it using Nokogiri's XML::Node, and I'm absolutely stuck. How do I take this Nokogiri XML document and work with it using node methods?

For example:

@doc.at_css('rates tr').add_next_sibling(element)

returns:

undefined method `add_next_sibling' for nil:NilClass (NoMethodError)

despite the fact that @doc.class is Nokogiri::XML::Document.

For completeness, here is the markup I'm trying to edit.

<html>
<head>
<title>Exchange Rates</title>
    <link rel="stylesheet" href="style.css">
</head>
<body>
    <table class="rates">
        <tr>
            <td class="up"><div></div></td>
            <td class="date">Saturday, Jan 12</td>
            <td class="rate up">3.83</td>
        </tr>
        <tr>
            <td class="up"><div></div></td>
            <td class="date">Friday, Jan 11</td>
            <td class="rate up">3.70</td>
        </tr>
        <tr>
            <td class="down"><div></div></td>
            <td class="date">Thursday, Jan 10</td>
            <td class="rate down">3.68</td>
        </tr>
        <tr>
            <td class="down"><div></div></td>
            <td class="date">Wedensday, Jan 9</td>
            <td class="rate down">3.70</td>
        </tr>
        <tr>
            <td class="up"><div></div></td>
            <td class="date">Tuesday, Jan 8</td>
            <td class="rate up">3.66</td>
        </tr>
    </table>
</body>
</html>

Upvotes: 4

Views: 1109

Answers (2)

Ismael
Ismael

Reputation: 16720

Try to load as HTML instead of XML Nokogiri::HTML(f)

Not getting in much detail on how Nokogiri works, lets say that XML does not have css right? So the method at_css doesn't make sense (maybe it does I dunno). So it should work loading as Html.

Update

Just noticed one thing. You want to do at_css('.rates tr') insteand of at_css('rates tr') because that's how you select a class in css. Maybe it works with XML now.

Upvotes: 2

the Tin Man
the Tin Man

Reputation: 160551

This is an example how to do what you are trying to do. Starting with f containing a shortened version of the HTML you want to parse:

require 'nokogiri'

f = '
<html>
<head>
<title>Exchange Rates</title>
    <link rel="stylesheet" href="style.css">
</head>
<body>
    <table class="rates">
        <tr>
            <td class="up"><div></div></td>
            <td class="date">Saturday, Jan 12</td>
            <td class="rate up">3.83</td>
        </tr>
    </table>
</body>
</html>
'

doc = Nokogiri::HTML(f)
doc.at('.rates tr').add_next_sibling('<p>foobar</p>')

puts doc.to_html

Your code is incorrectly trying to find the class="rates" parameter for <table>. In CSS we'd use .rates. An alternate way to do it using CSS is table[class="rates"].

Your example didn't define the node you were trying to add to the HTML, so I appended <p>foobar</p>. Nokogiri will let you build a node from scratch and append it, or use markup and add that, or you could find a node from one place in the HTML, remove it, and then insert it somewhere else.

That code outputs:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<title>Exchange Rates</title>
<link rel="stylesheet" href="style.css">
</head>
<body>
    <table class="rates">
<tr>
<td class="up"><div></div></td>
            <td class="date">Saturday, Jan 12</td>
            <td class="rate up">3.83</td>
        </tr>
<p>foobar</p>
</table>
</body>
</html>

It's not necessary to use at_css or at_xpath instead of at. Nokogiri senses what type of accessor you're using and handles it. The same applies using xpath or css instead of search. Also, at is equivalent to search('some accessor').first, so it finds the first occurrence of the matching node.

Upvotes: 3

Related Questions