nicohvi

Reputation: 2270

Adding a node using Nokogiri

I have an HTML string (for instance <div class="input">hello</div>) and I want to add a node only if the HTML tag in the string is a label (for instance <label>Hi</label>).

doc = Nokogiri::XML(html)

doc.children.each do |node|
  if node.name == 'label'
    # this code gets called
    span = Nokogiri::XML::Node.new "span", node
    span.content = "hello"

    puts span.parent 
    # nil

    span.parent = node
    # throws error "node can only have one parent"
  end
end

doc.to_html # Does not contain the span.

I cannot for the life of me understand what I'm doing wrong. Any help would be greatly appreciated.

Edit: This solved my problem, thanks for the answers!

# notice DocumentFragment rather than XML
doc = Nokogiri::HTML::DocumentFragment.parse(html)
doc.children.each do |node|
  if node.name == 'label'
    span = Nokogiri::XML::Node.new "span", doc
    node.add_child(span)
  end
end

Upvotes: 3

Views: 5577

Answers (2)

the Tin Man

Reputation: 160551

It's easy to add/change/delete HTML:

require 'nokogiri'

doc = Nokogiri::HTML::DocumentFragment.parse('<div class="input">hello</div>')
div = doc.at('div')
div << '<span>Hello</span>'
puts doc.to_html

Which results in:

# >> <div class="input">hello<span>Hello</span>
# >> </div>

Notice that << appended the new node to the <div>'s existing children, so the <span> lands after the text node containing "hello".

If you want to overwrite the children, you can do that easily using children =:

div.children = '<span>Hello</span>'
puts doc.to_html

Which results in:

# >> <div class="input"><span>Hello</span></div>

children = accepts either a single Node, which may have other nodes nested under it, or the HTML text of the node(s) to insert. That's what node_or_tags means when you see it in the documentation.

That said, to change just an embedded <label>, I'd do something like:

doc = Nokogiri::HTML::DocumentFragment.parse('<div class="input"><label>hello</label></div>')
label = doc.at('div label')
label.name = 'span' if label
puts doc.to_html
# >> <div class="input"><span>hello</span></div>

Or:

doc = Nokogiri::HTML::DocumentFragment.parse('<div class="input"><label>hello</label></div>')
label = doc.at('div label')
label.replace("<span>#{ label.text }</span>") if label
puts doc.to_html
# >> <div class="input"><span>hello</span></div>

Nokogiri makes it easy to change the tag's name once you've pointed at it. You can easily change the text inside the <span> by replacing #{ label.text } with whatever you desire.

at('div label') is one way of finding a particular node. It means "find the first label tag nested inside a div". at returns the first match and is equivalent to search(...).first. Both have CSS-only and XPath-only variants (at_css and at_xpath for at, css and xpath for search), described in the Nokogiri::XML::Node documentation if you need those.

Upvotes: 4

Mark Silverberg

Reputation: 1259

A few issues: your span = ... line created the node but never added it to the document. Also, you can't access span outside of the block where you created it.

I think this is what you're after:

require 'nokogiri'

html = '<label>Hi</label>'

doc = Nokogiri::XML(html)

doc.children.each do |node|
  if node.name == 'label'
    # this code gets called
    span = Nokogiri::XML::Node.new "span", doc
    span.content = "hello"
    node.add_child(span)
  end
end

# NOTE: neither `node` nor `span` is accessible outside of the each block

doc.to_s # => "<?xml version=\"1.0\"?>\n<label>Hi<span>hello</span></label>\n"

Note the node.add_child(span) line.

Upvotes: 2
