Reputation: 203
I need to parse all the child nodes of a parent node; however, some of the child nodes have the same name as the parent:
<div>
<img></img>
<div>
<img></img>
</div>
<img></img>
</div>
I'm using Nokogiri with Ruby, but when I call children()
on the first div node, the parsing ends prematurely at the first div
tag. Are there any workarounds for this?
Upvotes: 0
Views: 1076
Reputation: 303559
Assuming that you have a starting node and want all the child nodes that have the same name, here are some options for helper methods:
# Using Ruby to Filter
def same_kind_children(node)
  node.element_children.select { |n| n.name == node.name }
end

# Using XPath to Filter
def same_kind_children(node)
  node.xpath(node.name)
end

# Descendants instead of Children
def same_kind_descendants(node)
  node.xpath(".//#{node.name}")
end
If you have a particular kind of node in mind and want to find every node of that type with the same-type parent:
divs_in_divs = doc.xpath('//div/div') # leading // matches at any depth, not only directly under the document
Although it seems unlikely, if you instead don't have a particular starting node or node name in mind, but want to find all the nodes that have the same name as their parent, you could do:
same_kind_nested = doc.xpath('//*').select { |node| node.name == node.parent.name }
Upvotes: 1
Reputation: 55012
I almost hate to say it but it sounds like another good case for traverse:
require 'nokogiri'

html = <<EOF
<div>
<img></img>
<div>
<img></img>
</div>
<img></img>
</div>
EOF

doc = Nokogiri::HTML(html)

# Visit every node under the root and print those named like their parent
doc.root.traverse do |node|
  puts node if node.parent.name == node.name
end
Upvotes: 1