Dantes

Reputation: 2891

Extraction from string - Ruby

I have a string of HTML that serves as a teaser for the blog posts I am creating. The whole teaser is stored in a single field in the database.

My goal: when a user likes a blog post (via a Facebook Like button), the right data should be displayed in their news feed. To do that, I need to extract the image path from the src="i-m-a-g-e--p-a-t-h" attribute of the first image in the teaser. This works when the teaser contains only one image, but if the user accidentally adds two or more, the whole thing crashes. Furthermore, for the description field I need to extract the text inside the first <p> tag. A further complication is that a user can put an image inside that first <p> tag.

I would very much appreciate it if an expert could help me resolve this; it has been bugging me for days.

A sample teaser string, together with the regular expression I use for extracting src, can be found here: http://rubular.com/r/gajzivoBSf

Thanks!

Upvotes: 0

Views: 235

Answers (1)

Phrogz

Reputation: 303224

Don't try to parse HTML by yourself. Let the professionals do it.

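require 'nokogiri'

# Parse the teaser as an HTML fragment rather than a full document.
frag = Nokogiri::HTML.fragment(your_html_string)
# at_css returns only the first match, so any extra images are ignored.
first_img_src = frag.at_css('img')['src']
first_p_text  = frag.at_css('p').text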

Upvotes: 2
