Reputation: 1499
When using Nokogiri to parse HTML and selecting the <a> elements inside elements with class="favourite":
galleries = doc.css(".favourite a")
# doc holds the result of Nokogiri::HTML(source_page.body)
puts galleries
returns:
<a href="/galleries/6730">...</a>
<a href="/favourites/40565414">...</a>
<a href="/galleries/10851">...</a>
<a href="/favourites/40850848">...</a>
How can I extract only the href attribute values that match /galleries/[0-9]+?
Upvotes: 0
Views: 203
Reputation: 303253
Using more Ruby and less XPath:
# The leading slash is part of the pattern so the returned strings match the hrefs exactly.
doc.css('.favourite a').map{ |a| a['href'][%r{/galleries/\d+}] }.compact
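To see what the regex indexing does on its own, here is a minimal sketch that leaves Nokogiri out and works on plain strings. The gallery_paths helper and the sample hrefs are hypothetical, made up for illustration. The anchored pattern and the nil handling are assumptions about what you want, not something the question spells out.

```ruby
# Hypothetical helper: keeps only hrefs that are exactly "/galleries/<digits>".
# String#[] with a Regexp returns the matched substring, or nil if nothing matches,
# so compact drops every non-matching entry.
def gallery_paths(hrefs)
  # to_s turns a nil href (an <a> without an href attribute) into "", which never matches.
  hrefs.map { |href| href.to_s[%r{\A/galleries/\d+\z}] }.compact
end

# Sample data mirroring the hrefs in the question, plus a nil for a missing attribute.
sample = [
  "/galleries/6730",
  "/favourites/40565414",
  "/galleries/10851",
  "/favourites/40850848",
  nil
]

p gallery_paths(sample)
# => ["/galleries/6730", "/galleries/10851"]
```

The \A and \z anchors make the match strict, so a path such as "/favourites/galleries/1" would be rejected. Drop them if a partial match is acceptable.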
Upvotes: 1
Reputation: 14402
galleries.xpath("@href[contains(., 'galleries')]").map(&:value)
# => ["/galleries/6730", "/galleries/10851"]
Upvotes: 1