Hamptonite

Reputation: 109

Using Nokogiri to find element before another element

I have a partial HTML document:

<h2>Destinations</h2>
<div>It is nice <b>anywhere</b> but here.</div>
<ul>
  <li>Florida</li>
  <li>New York</li>
</ul>
<h2>Shopping List</h2>
<ul>
  <li>Booze</li>
  <li>Bacon</li>
</ul>

For every <li> item, I want to know the category the item is in, i.e., the text of the <h2> tag that precedes its list.

This code does not work, but this is what I'm trying to do:

@page.search('li').each do |li|
  li.previous('h2').text
end

Upvotes: 3

Views: 2272

Answers (3)

Mark Thomas

Reputation: 37517

You are close.

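@page.search('li').each do |li|
  # preceding-sibling is a reverse axis, so [1] picks the nearest <h2>;
  # without it, every earlier <h2> matches and .text concatenates them.
  category = li.xpath('../preceding-sibling::h2[1]').text
  puts "#{li.text}: category #{category}"
end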

Upvotes: 1

Martin

Reputation: 7714

Nokogiri allows you to use XPath expressions to locate an element:

categories = []

doc.xpath("//li").each do |elem|
  categories << elem.parent.xpath("preceding-sibling::h2").last.text
end

categories.uniq!
p categories

The first part selects all "li" elements. For each one, we take its parent (ul or ol), then look for h2 elements that come before that parent at the same level (preceding-sibling). There can be more than one, so we take the last, i.e., the one closest to the current position.

We need to call "uniq!" because we get the h2 once for each 'li' (the 'li' is the starting point), so every category would otherwise appear once per item.

Using your own HTML example, this code outputs:

["Destinations", "Shopping List"]

Upvotes: 4

The code:

categories = []
Nokogiri::HTML("your HTML here").css("h2").each do |category|
  categories << category.text
end

The result:

categories = ["Destinations", "Shopping List"] 

Upvotes: -2
