Mrk Fldig

Reputation: 4486

Ruby Nokogiri extract text after the end of a tag

I have a rather basic question, which probably means I'm missing something. I'm using Nokogiri to scrape a site.

I want to extract the text AFTER the end of a strong tag within a div which looks like this:

<p style="padding-bottom:0px;"><strong>Location:</strong> Cape Town</p>

Currently my code is as follows:

location = detail_page.css('p[style="padding-bottom:0px;"]').text

This obviously returns the Location: label from the strong element as well. Is there a way to get only the text after it without using a regex?

The reason for asking is that there are other divs in the same format containing information which I need, so I can't just delete the strong elements.

Thanks in advance

Marc

Upvotes: 0

Views: 733

Answers (2)

Arup Rakshit

Reputation: 118261

Here is how I would do it:

require 'nokogiri'

@doc = Nokogiri::HTML.parse('<p style="padding-bottom:0px;"><strong>Location:</strong> Cape Town</p>')
@doc.at_css('p[style*="padding-bottom:0px;"] > text()').text.strip
# => Cape Town

Upvotes: 0

matt

Reputation: 79723

You could use XPath:

detail_page.xpath('//p[@style="padding-bottom:0px;"]/strong/following-sibling::text()')

This selects any text nodes that are following siblings of strong elements which are in turn children of p elements with a style attribute with the value padding-bottom:0px;. The result is a NodeSet, so call .text.strip on it to get the plain string.

Upvotes: 1
