user1371011

Reputation: 203

Get child nodes with the same name as parent

I need to parse all child nodes of a parent node, but some of the child nodes have the same name as the parent:

<div>
  <img></img>
  <div>
    <img></img>
  </div>
  <img></img>
</div>

I'm using Nokogiri with Ruby, but when I call children() on the first div node, the parsing ends prematurely at the nested div tag. Are there any workarounds for this?

Upvotes: 0

Views: 1076

Answers (2)

Phrogz

Reputation: 303559

Assuming that you have a starting node and want all the child nodes that have the same name, here are some options for helper methods:

# Using Ruby to Filter
def same_kind_children(node)
  node.element_children.select{ |n| n.name==node.name }
end

# Using XPath to Filter
def same_kind_children(node)
  node.xpath(node.name)
end

# Descendants instead of Children
def same_kind_descendants(node)
  node.xpath(".//#{node.name}")
end

If you have a particular kind of node in mind and want to find every node of that type whose parent is the same type:

# The leading // searches the whole document, not just the top level
divs_in_divs = doc.xpath('//div/div')

Although it seems unlikely, if you instead don't have a particular starting node or node name in mind, but want to find all the nodes that have the same name as their parent, you could do:

same_kind_nested = doc.xpath('//*').select{ |node| node.name==node.parent.name }

Upvotes: 1

pguardiario

Reputation: 55012

I almost hate to say it but it sounds like another good case for traverse:

require 'nokogiri'
html = <<EOF
<div>
  <img></img>
  <div>
    <img></img>
  </div>
  <img></img>
</div>
EOF

doc = Nokogiri::HTML html
doc.root.traverse do |node|
  if node.parent.name == node.name
    puts node
  end
end

Upvotes: 1
