Reputation: 11
I have a document like this:
<DL><a lot of tags>...<H3>Entry 1</H3><a lot of tags>...</DL>
<DL><a lot of tags>...<H3>Entry 2</H3><a lot of tags>...
<DL><a lot of tags>...<H3>Entry 21</H3><a lot of tags>...
<DL><a lot of tags>...<H3>Entry 211</H3><a lot of tags>...</DL>
</DL>
</DL>
<DL><a lot of tags>...<H3>Entry 3</H3><a lot of tags>...</DL>
I want to find all the entries, which is easy with the following code:
@doc = Nokogiri::HTML(@file)
@doc.css('DL>h3').each { |node| puts node.text }
How can I extract the list of parent H3 entries for any given entry? I'd like a method like 'parent' that returns the relationship, e.g.: entry211.parent ==> /Entry 2/Entry 21/
Upvotes: 1
Views: 509
Reputation: 20145
If you simply want the parent element of each h3 element,
@doc.css('dl>h3').collect(&:parent)
should do the trick.
However, it looks like you might want all h3 elements that are children of a dl element that is an ancestor of an h3 element. If I've understood that and your structure correctly, you should be able to do
@doc.css('dl>h3').collect { |h3| h3.ancestors('dl').css('h3') }
This gives you an Array containing, for each h3 element, a collection (a Nokogiri NodeSet) of the h3 elements that are descendants of the dl elements in that h3 element's ancestry. Confused? I sure am :)
For example, using your sample HTML, the result for the Entry 211 h3 is
@doc.css('dl>h3').collect { |h3| h3.ancestors('dl').css('h3') }[3].collect(&:text)
#=> ["Entry 211", "Entry 21", "Entry 2"]
Is this close enough to what you want?
Upvotes: 1