Reputation: 316
My Ruby script is supposed to read in an XML doc from a URL and check it for well-formedness, returning any errors. I have a sample bad XML document hosted with the following text (from the Nokogiri tutorial):
<?xml version="1.0"?>
<root>
<open>foo
<closed>bar</closed>
</root>
My test script is as follows (url refers to the above XML file hosted on my personal server):
require 'nokogiri'
document = Nokogiri::XML(url)
puts document
puts document.errors
The output is:
<?xml version="1.0"?>
Start tag expected, '<' not found
Why is it only capturing the first line of the XML file? It does this even with known-good XML files.
Upvotes: 4
Views: 3276
Reputation: 5947
If you are getting the XML from an existing Nokogiri document, make sure you call '.to_s' on it before passing it to the XML function.
For example: xml = Nokogiri::XML(existing_nokogiri_xml_doc.to_s)
Upvotes: 0
Reputation: 29463
I'm not sure what code you are using to output the contents of the XML; I only see error-printing code. Here is some sample code for walking through XML with Nokogiri, assuming the document at url contains elements like these:
<item>
Something
</item>
<item>
Else
</item>
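require 'nokogiri'
require 'open-uri'
# URI.open fetches the document body; plain open(url) no longer accepts URLs as of Ruby 3.0
doc = Nokogiri::XML(URI.open(url))
set = doc.xpath('//item')
# item.text returns the element's text content; strip removes the surrounding newlines
set.each { |item| puts item.text.strip }
#=> Something
#=> Else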
In general, the tutorial here should help you.
Upvotes: 3
Reputation: 3924
It is trying to parse the URL string itself, not the content it points to. Note that the first parameter to Nokogiri::XML must be a string containing the document or an IO object, since it is just a shortcut to Nokogiri::XML::Document.parse, as stated here.
EDIT: To read from a URI:
require 'open-uri'
URI.open(uri).read  # plain open(uri) no longer handles URLs as of Ruby 3.0
Upvotes: 5