Paul

Reputation: 316

Problem reading XML with Nokogiri

My Ruby script is supposed to read in an XML doc from a URL and check it for well-formedness, returning any errors. I have a sample bad XML document hosted with the following text (from the Nokogiri tutorial):

<?xml version="1.0"?>
  <root>
    <open>foo
      <closed>bar</closed>
  </root>

My test script is as follows (url refers to the above xml file hosted on my personal server):

require 'nokogiri'

document = Nokogiri::XML(url) 

puts document
puts document.errors

The output is:

<?xml version="1.0"?>
Start tag expected, '<' not found

Why is it only capturing the first line of the XML file? It does this even with known-good XML files.

Upvotes: 4

Views: 3276

Answers (3)

RoundPi

Reputation: 5947

If you are getting the XML from an existing Nokogiri document, make sure you call '.to_s' on it before passing it to the XML function.

For example: xml = Nokogiri::XML(existing_nokogiri_xml_doc.to_s)

Upvotes: 0

providence

Reputation: 29463

I'm not too sure what code you are using to actually output the contents of the XML. I only see error printing code. However, I have posted some sample code to effectively move through XML with Nokogiri below:

<items>
  <item>
    Something
  </item>
  <item>
    Else
  </item>
</items>

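require 'nokogiri'
require 'open-uri'

doc = Nokogiri::XML(URI.open(url))  # fetch the document, then parse its content
set = doc.xpath('//item')
set.each { |item| puts item.text.strip }  # text only, without the surrounding tags
  #=> Something
  #=> Else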

In general, the tutorial here should help you.

Upvotes: 3

Serabe

Reputation: 3924

It is trying to parse the URL string itself, not the content found at that URL. Note that the first parameter to Nokogiri::XML must be a string containing the document, or an IO object, since it is just a shortcut to Nokogiri::XML::Document.parse, as stated here.

EDIT: For reading from a URI:

require 'open-uri'
# Use URI.open (Ruby 2.5+); plain open(uri) no longer opens URLs as of Ruby 3.0.
URI.open(uri).read

Upvotes: 5
