user3434212

Reputation: 87

Foreach loop in XML generator not breaking

I am trying to generate XML, but the loop isn't breaking. Here is a part of the code:

@key = 0
@cont.each do |pr|
  xml.product {
    @key += 1
    puts @key.to_s
    begin
      @main = Nokogiri::HTML(open(@url+pr['href'], "User-Agent" => "Ruby/#{RUBY_VERSION}","From" => "[email protected]", "Referer" => "http://www.ruby-lang.org/"))
    rescue
      puts "rescue"
      next
    end
    puts pr['href']
    puts @key.to_s
    break  # this break doesn't work
    #something else
  }
end

The interesting part is that break appears to work in the final generated XML file: it contains only one product. On the console, however, @key was printed for every item, which means the each loop itself didn't break.

Could this be a Nokogiri XML-specific problem, caused by the opening brace at the top of the loop body?

Upvotes: 0

Views: 288

Answers (1)

the Tin Man

Reputation: 160601

In general, I think the way you're going about generating the XML is confused. Don't make your code more convoluted than necessary: instead of starting to generate some XML and then aborting inside the block because you can't find the page you want, grab the pages you want first, then start processing.

I'd move the begin/rescue block outside the XML generation. Inside the xml.product { ... } block, next and break apply to that block rather than to the enclosing each, so the loop keeps running, which matches what you're seeing on the console. Keeping them there makes the logic hard to follow. Instead, I'd recommend something like this untested code:

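require 'nokogiri'
require 'open-uri'

@main = []
@cont.each do |pr|
  begin
    @main << Nokogiri::HTML(
      # URI.open comes from open-uri; a bare open(url) no longer fetches URLs on Ruby 3.0+
      URI.open(@url + pr['href'])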
    )
  rescue
    puts 'rescue'
    next
  end
end

builder = Nokogiri::XML::Builder.new do |xml|
  xml.root {
    xml.products {
      @main.each do |m|
        xml.product {
          xml.id_ m.at('id').text
          xml.name m.at('name').text
        }
      end
    }
  }
end
puts builder.to_xml

This makes it easy to see that the XML generation depends only on the pages that were successfully retrieved.
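As for why the original break seemed to be ignored: in Ruby, break inside a block ends the method call that block was passed to, so a break inside a nested block only leaves the inner call. Here is a minimal sketch of that behavior in plain Ruby. The element method is a hypothetical stand-in for a builder call such as xml.product, on the assumption that it behaves like any other method that takes a block; it is not Nokogiri's actual implementation.

```ruby
# Hypothetical stand-in for a builder method such as xml.product:
# it simply runs the block it is given.
def element
  yield
end

# break inside the inner block: only the element call is exited,
# so each moves on to the next item.
def break_inside(items)
  visited = []
  items.each do |item|
    element do
      visited << item
      break            # leaves element, not items.each
    end
  end
  visited
end

# break directly in the each block: the loop itself stops.
def break_outside(items)
  visited = []
  items.each do |item|
    element { visited << item }
    break              # leaves items.each
  end
  visited
end

p break_inside([1, 2, 3])   # => [1, 2, 3]
p break_outside([1, 2, 3])  # => [1]
```

The same reasoning applies to next: inside the inner block it only finishes that block early, which is another reason to do the page fetching and error handling before the XML generation starts.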

This code is untested because we have no idea what your input values are or what your output should look like. Having valid input, expected output and a working example of your code that demonstrates the problem is essential if you want help debugging a problem with your code.

The use of @url + pr['href'] isn't generally a good idea. Instead use the URI class to build up the URL for you. URI handles encoding and ensures the URI is valid.
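As a rough sketch of what that could look like, URI.join from the standard library resolves a link against a base URL. The base URL, the href values and the page_url helper below are made up for illustration; they stand in for @url and pr['href'].

```ruby
require 'uri'

# Hypothetical helper: resolve href against base and return the
# result as a string, or nil if URI cannot parse the input.
def page_url(base, href)
  URI.join(base, href).to_s
rescue URI::InvalidURIError
  nil
end

base = 'http://www.example.com/catalog/'  # assumed value of @url

# A relative href is resolved against the base path.
puts page_url(base, 'items/42.html')
# http://www.example.com/catalog/items/42.html

# A root-relative href replaces the base path instead of being appended.
puts page_url(base, '/items/42.html')
# http://www.example.com/items/42.html
```

With plain string concatenation the second case would produce a doubled slash in the path, and an already absolute href would be glued onto the base. Resolving through URI avoids both.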

Upvotes: 1
