user3037484

Reputation: 25

Does REXML::Document.new accept a plain string as a valid document?

I would like to check whether a piece of XML is valid. Here is my code:

require 'rexml/document'

# Returns the parsed document, or nil if REXML raises a parse error.
def valid_xml?(xml)
  REXML::Document.new(xml)
rescue REXML::ParseException
  nil
end

begin
  bad_xml_2 = %{aasdasdasd}
  if valid_xml?(bad_xml_2).nil?
    puts("bad xml")
    raise "bad xml"
  end
  puts("good_xml")
rescue StandardError => e
  puts("exception: " + e.message)
end

This prints good_xml. Did I do something wrong? It prints bad xml if the string is:

bad_xml = %{
     <tasks>
      <pending>

      <entry>Grocery Shopping</entry>
      <done>
      <entry>Dry Cleaning</entry>
     </tasks>}

Upvotes: 0

Views: 1198

Answers (2)

Uri Agassi

Reputation: 37409

REXML treats a plain string as a valid XML document with no root node:

xml = REXML::Document.new('aasdasdasd')
# => <UNDEFINED> ... </>

However, it does not accept malformed XML (with mismatched tags, for example); in that case it raises an exception:

REXML::Document.new(bad_xml)
# REXML::ParseException: #<REXML::ParseException: Missing end tag for 'done' (got "tasks")

The <done> element is missing its end tag, so the document is not well-formed.
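
If the goal is to reject plain strings as well, one option is to also check that the parsed document has a root element. The following is only a minimal sketch: well_formed_with_root? is a hypothetical helper name (not part of REXML), and it assumes that a document without a root element should count as invalid. It also rescues REXML::ParseException, so it should behave the same way on a REXML version that rejects such input at parse time.

```ruby
require 'rexml/document'

# Hypothetical helper (not part of REXML): returns true only when the string
# parses without a REXML::ParseException AND the document has a root element.
def well_formed_with_root?(xml)
  doc = REXML::Document.new(xml)
  !doc.root.nil?
rescue REXML::ParseException
  false
end

puts well_formed_with_root?('aasdasdasd')              # prints false
puts well_formed_with_root?('<tasks><done></tasks>')   # prints false
puts well_formed_with_root?('<tasks><done/></tasks>')  # prints true
```

Returning a plain boolean also avoids comparing the result against nil, as the code in the question does.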

Upvotes: 0

the Tin Man

Reputation: 160581

Personally, I'd recommend using Nokogiri, as it's the de facto standard for XML/HTML parsing in Ruby. Using it to parse a malformed document:

require 'nokogiri'

doc = Nokogiri::XML('<xml><foo><bar></xml>')
doc.errors # => [#<Nokogiri::XML::SyntaxError: Opening and ending tag mismatch: bar line 1 and xml>, #<Nokogiri::XML::SyntaxError: Premature end of data in tag foo line 1>, #<Nokogiri::XML::SyntaxError: Premature end of data in tag xml line 1>]

If I parse a document that is well-formed:

doc = Nokogiri::XML('<xml><foo/><bar/></xml>')
doc.errors # => []

Upvotes: 1
